PROCEDURES AND MATERIALS FOR CONFERRING
DISEASE RESISTANCE IN PLANTS 

This application is related to U.S. Patent Application No. 08/587,680, filed 
January 17, 1996, which is a continuation in part of copending U.S. patent application No. 
08/567,375, filed December 4, 1995, which is a continuation in part of U.S. provisional 
patent application No. 60/004,645. The '680 application is also a continuation in part of
copending U.S. patent application No. 08/475,891, filed June 7, 1995, which is a 
continuation in part of copending U.S. patent application No. 08/373,374, filed January 
17, 1995. These applications are incorporated herein by reference. 

Field Of The Invention 

The present invention relates generally to plant molecular biology. In 
particular, it relates to nucleic acids and methods for conferring disease resistance in 
plants. 

Statement as to Rights to Inventions Made Under 
Federally Sponsored Research and Development 

This invention was made with Government support under Grant No. 
GM47907, awarded by the National Institutes of Health and Grant No. 9300834, awarded 
by the United States Department of Agriculture. The Government has certain rights in this 
invention. 

BACKGROUND OF THE INVENTION 
Loci conferring disease resistance have been identified in many plant 
species. Genetic analysis of many plant-pathogen interactions has demonstrated that plants 
contain loci that confer resistance against specific races of a pathogen containing a 
complementary avirulence gene. Molecular characterization of these genes should provide 
means for conferring disease resistance to a wide variety of crop plants. 

Those plant resistance genes that have been characterized at the molecular
level fall into four classes. One gene, Hm1 in corn, encodes a reductase and is effective
against the fungal pathogen Cochliobolus carbonum (Johal et al. Science 258:985-987
(1992)). In tomato, the Pto gene confers resistance against Pseudomonas syringae that
express the avrPto avirulence gene (Martin et al. Science 262:1432 (1993)). The
predicted Pto gene encodes a serine threonine protein kinase. The tomato Cf-9 gene
confers resistance to races of the fungus Cladosporium fulvum that carry the avirulence
gene Avr9 (Jones et al. Science 266:789-793 (1994)). The tomato Cf-9 gene encodes a
putative extracellular LRR protein. Finally, the RPS2 gene of Arabidopsis thaliana
confers resistance to P. syringae that express the avrRpt2 avirulence gene (Bent et al.
Science 265:1856-1860 (1994)). RPS2 encodes a protein with an LRR motif and a P-loop
motif.

Bacterial blight disease caused by Xanthomonas spp. infects virtually all
crop plants and leads to extensive crop losses worldwide. Bacterial blight disease of rice
(Oryza sativa), caused by Xanthomonas oryzae pv. oryzae (Xoo), is an important disease
of this crop. Races of Xoo that induce resistant or susceptible reactions on rice cultivars
with distinct resistance (Xa) genes have been identified. One source of resistance (Xa21)
had been identified in the wild species Oryza longistaminata (Khush et al. in Proceedings
of the International Workshop on Bacterial Blight of Rice. (International Rice Research
Institute, 1989) and Ikeda et al. Jpn J. Breed 40 (Suppl. 1):280-281 (1990)). Xa21 is a
dominant resistance locus that confers resistance to all known isolates of Xoo and is the
only characterized Xa gene that carries resistance to Xoo race 6. Genetic and physical
analysis of the Xa21 locus has identified a number of tightly linked markers on
chromosome 11 (Ronald et al. Mol. Gen. Genet. 236:113-120 (1992)). The molecular
mechanisms by which the Xa21 locus confers resistance to this pathogen were not
identified, however.

Considerable effort has been directed toward cloning plant genes conferring 
resistance to a variety of bacterial, fungal and viral diseases. Only one pest resistance 
gene has been cloned in monocots. Since monocot crops feed most humans and animals in 
the world, the identification of disease resistance genes in these plants is particularly 
important. The present invention addresses these and other needs. 

SUMMARY OF THE INVENTION

The present invention provides isolated nucleic acid constructs comprising
an RRK polynucleotide sequence. The sequences can be rice sequences which hybridize to
SEQ ID NOs: 1, 4, 6, 8, 10, or 11 under stringent conditions. Also claimed are
sequences from cassava which hybridize to SEQ ID NO: 13, maize sequences which
hybridize to SEQ ID NOs: 15 or 16, and tomato sequences (e.g., SEQ ID NOs: 17, 19, or 21).
Exemplary RRK polynucleotide sequences are Xa21 sequences which encode an Xa21
polypeptide as shown below. The RRK polynucleotides encode a protein having a leucine
rich repeat motif and/or a cytoplasmic protein kinase domain. The nucleic acid constructs
of the invention may further comprise a promoter operably linked to the RRK
polynucleotide sequence. The promoter may be a tissue-specific promoter or a constitutive
promoter.

The invention also provides nucleic acid constructs comprising a promoter
sequence from an RRK gene linked to a heterologous polynucleotide sequence. Exemplary
heterologous polynucleotide sequences include structural genes which confer pathogen
resistance on plants.

The invention further provides transgenic plants comprising a recombinant
expression cassette comprising a promoter from an RRK gene operably linked to a
polynucleotide sequence as well as transgenic plants comprising a recombinant expression
cassette comprising a plant promoter operably linked to an RRK polynucleotide sequence.
Although any plant can be used in the invention, rice and tomato plants may be
conveniently used.

The invention further provides methods of enhancing resistance to
Xanthomonas and other pathogens in a plant. The methods comprise introducing into the
plant a recombinant expression cassette comprising a plant promoter operably linked to an
RRK polynucleotide sequence. The methods may be conveniently carried out with rice or
tomato plants.



Definitions

The term "plant" includes whole plants, plant organs (e.g., leaves, stems,
roots, etc.), seeds and plant cells and progeny of same. The class of plants which can be
used in the methods of the invention is generally as broad as the class of higher plants
amenable to transformation techniques, including both monocotyledonous and
dicotyledonous plants.

WO 99/09151 PCT/US98/14841

A "heterologous sequence" is one that originates from a foreign species, or, 
if from the same species, is substantially modified from its original form. For example, a 
promoter operably linked to a heterologous structural gene is from a species different from 
that from which the structural gene was derived, or, if from the same species, one or both 
are substantially modified from their original form. 

An "RRK gene" is a member of a new class of disease resistance genes which
encode RRK polypeptides which typically comprise an extracellular LRR domain, a
transmembrane domain, and a cytoplasmic protein kinase domain (as shown in, e.g., Pto
and Fen (Martin et al. Plant Cell 6:1543-1552 (1994))). As used herein, an LRR domain
is a region of a repeated unit of about 24 residues as described in USSN 08/587,680 and
found in Cf-9. Using the sequences disclosed here and standard nucleic acid hybridization
and/or amplification techniques, one of skill can identify members of this class of genes.
For instance, a nucleic acid probe from an Xa21 gene detected polymorphisms that
segregated with the blast (Pyricularia oryzae) resistance gene (Pi7) in 58 recombinant
inbred lines of rice. The same probe also detected polymorphism in nearly isogenic lines
carrying xa5 and Xa10 resistance genes.

In some preferred embodiments, members of this class of disease resistance
genes can be identified by their ability to be amplified by degenerate PCR primers which
correspond to the LRR and kinase domains. For instance, primers have been used to
isolate homologous genes in tomato, maize and cassava. The maize gene disclosed here
has been genetically mapped to a region associated with resistance to Helminthosporium
turcicum. Exemplary primers for this purpose are tcaagcaacaatttgtcaggnca(a/g)at(a/c/t)cc
(for the LRR domain sequence GQIP) and taacagcacattgcttgatttnan(g/a)tcncg(g/a)tg
(for the kinase domain sequence HCDIK). These or equivalent primers are then used to
amplify the appropriate nucleic acid using the PCR conditions described below.
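
The degenerate primer notation used above can be made concrete with a short sketch. The following Python example is illustrative only and is not part of the disclosed method; the function names are hypothetical. It assumes that "n" means any of the four bases and that a parenthesized group such as (a/g) lists the bases allowed at a single position, and it counts (or enumerates) the individual oligonucleotides represented by each degenerate primer.

```python
import re
from itertools import product

# Primer strings copied from the text; spaces are ignored when parsing.
LRR_PRIMER = "tcaagcaacaatttgtcaggnca(a/g)at(a/c/t)cc"
KINASE_PRIMER = "taacagcacattgcttgatttnan(g/a)tcncg(g/a)tg"

# One token per primer position: a group like (a/g), or a single base / n.
_TOKEN = re.compile(r"\([acgt](?:/[acgt])+\)|[acgtn]")


def parse_degenerate(primer: str) -> list[list[str]]:
    """Return, for every position, the list of bases allowed there."""
    positions = []
    for token in _TOKEN.findall(primer.replace(" ", "").lower()):
        if token == "n":                      # assumption: n = any base
            positions.append(list("acgt"))
        elif token.startswith("("):           # e.g. (a/c/t) -> ['a','c','t']
            positions.append(token[1:-1].split("/"))
        else:
            positions.append([token])
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct oligonucleotides the degenerate primer represents."""
    count = 1
    for options in parse_degenerate(primer):
        count *= len(options)
    return count


def expand(primer: str) -> list[str]:
    """Enumerate every concrete sequence in the degenerate pool."""
    return ["".join(bases) for bases in product(*parse_degenerate(primer))]


if __name__ == "__main__":
    print("LRR primer degeneracy:", degeneracy(LRR_PRIMER))
    print("Kinase primer degeneracy:", degeneracy(KINASE_PRIMER))
```

Under these assumptions the LRR primer represents a pool of 24 sequences and the kinase primer a pool of 256, which gives a rough sense of how permissive each primer is.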

An "Xa21 polynucleotide sequence" is a subsequence or full length
polynucleotide sequence of an Xa21 gene, such as the rice Xa21 gene, which, when
present in a transgenic plant confers resistance to Xanthomonas spp. (e.g., X. oryzae) on
the plant. Exemplary polynucleotides of the invention include the coding region of the
sequences provided below. An Xa21 polynucleotide is typically at least about 3100
nucleotides to about 6500 nucleotides in length, usually from about 4000 to about 4500
nucleotides.

An "Xa21 polypeptide" is a gene product of an Xa21 polynucleotide 
sequence, which has the activity of Xa21, i.e., the ability to confer resistance to 
Xanthomonas spp. Xa21 polypeptides, like other RRK polypeptides, are characterized by
the presence of an extracellular domain comprising a region of leucine rich repeats (LRR) 
and/or a cytoplasmic protein kinase domain. Exemplary Xa21 polypeptides of the 
invention include those described below. 

In the expression of transgenes one of skill will recognize that the inserted 
polynucleotide sequence need not be identical and may be "substantially identical" to a 
sequence of the gene from which it was derived. As explained below, these variants are 
specifically covered by this term. 

In the case where the inserted polynucleotide sequence is transcribed and 
translated to produce a functional RRK polypeptide, one of skill will recognize that 
because of codon degeneracy, a number of polynucleotide sequences will encode the same 
polypeptide. These variants are specifically covered by the term "RRK polynucleotide
sequence". In addition, the term specifically includes those full length sequences 
substantially identical (determined as described below) with an RRK gene sequence and 
that encode proteins that retain the function of the RRK protein. Thus, in the case of rice 
RRK genes disclosed here, the above term includes variant polynucleotide sequences which 
have substantial identity with the sequences disclosed here and which encode proteins 
capable of conferring resistance to Xanthomonas or other plant diseases and pests on a
transgenic plant comprising the sequence. 
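
As an illustration of the codon degeneracy point, the following Python sketch counts how many distinct coding sequences encode a given peptide. It assumes the standard genetic code and is not taken from the specification; the function name is hypothetical. The example peptides are the GQIP and HCDIK motifs mentioned in connection with the degenerate primers.

```python
from collections import Counter

# Standard genetic code, codons ordered with bases T, C, A, G at each position.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"   # first base T
    "LLLLPPPPHHQQRRRR"   # first base C
    "IIIMTTTTNNKKSSRR"   # first base A
    "VVVVAAAADDEEGGGG"   # first base G
)

CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}

# Number of synonymous codons available for each residue ('*' = stop).
CODONS_PER_RESIDUE = Counter(CODON_TABLE.values())


def count_encodings(peptide: str) -> int:
    """Number of distinct nucleotide sequences that encode `peptide`."""
    total = 1
    for residue in peptide.upper():
        if residue not in CODONS_PER_RESIDUE or residue == "*":
            raise ValueError(f"not a standard amino acid: {residue!r}")
        total *= CODONS_PER_RESIDUE[residue]
    return total


if __name__ == "__main__":
    for motif in ("GQIP", "HCDIK"):
        print(motif, count_encodings(motif))
```

Even a four or five residue motif is encoded by dozens of different nucleotide sequences, which is why the term "RRK polynucleotide sequence" is defined to cover such variants.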

Two polynucleotides or polypeptides are said to be "identical" if the 
sequence of nucleotides or amino acid residues, respectively, in the two sequences is the 
same when aligned for maximum correspondence as described below. The term 
"complementary to" is used herein to mean that the complementary sequence is identical to 
all or a portion of a reference polynucleotide sequence. 

Sequence comparisons between two (or more) polynucleotides or
polypeptides are typically performed by comparing the two sequences over a
segment or "comparison window" to identify and compare local regions of sequence
similarity. The segment used for purposes of comparison may be at least about 20
contiguous positions, usually about 50 to about 200, more usually about 100 to about 150,
in which a sequence may be compared to a reference sequence of the same number of
contiguous positions after the two sequences are optimally aligned.

Optimal alignment of sequences for comparison may be conducted by the
local homology algorithm of Smith and Waterman Adv. Appl. Math. 2:482 (1981), by the
homology alignment algorithm of Needleman and Wunsch J. Mol. Biol. 48:443 (1970), by
the search for similarity method of Pearson and Lipman Proc. Natl. Acad. Sci. (U.S.A.)
85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT,
FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer
Group (GCG), 575 Science Dr., Madison, WI), or by inspection.

"Percentage of sequence identity" is determined by comparing two optimally aligned
sequences over a comparison window, wherein the portion of the polynucleotide sequence 
in the comparison window may comprise additions or deletions (i.e., gaps) as compared to 
the reference sequence (which does not comprise additions or deletions) for optimal 
alignment of the two sequences. The percentage is calculated by determining the number 
of positions at which the identical nucleic acid base or amino acid residue occurs in both 
sequences to yield the number of matched positions, dividing the number of matched 
positions by the total number of positions in the window of comparison and multiplying 
the result by 100 to yield the percentage of sequence identity. 
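
The calculation just described can be sketched in a few lines of Python. This is a minimal illustration, not the GCG implementation: it assumes the two sequences have already been optimally aligned by one of the programs named above, that they are supplied as equal-length strings, and that "-" marks a gap. The function name and the example sequences are hypothetical.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of sequence identity over an aligned comparison window.

    Matched positions are those where both sequences carry the same base or
    residue; gap positions never count as matches. The denominator is the
    total number of positions in the window, as in the definition above.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    if not aligned_a:
        raise ValueError("comparison window is empty")

    matched = sum(
        1
        for x, y in zip(aligned_a.upper(), aligned_b.upper())
        if x == y and x != gap
    )
    return 100.0 * matched / len(aligned_a)


if __name__ == "__main__":
    # Toy 9-position window with one gap and one mismatch.
    print(round(percent_identity("ACGT-ACGT", "ACGTTACGA"), 2))
```

In the toy window, 7 of 9 positions match, so the function reports roughly 77.8% identity.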

The term "substantial identity" of polynucleotide sequences means that a 
polynucleotide comprises a sequence that has at least 60% sequence identity, preferably at 
least 80%, more preferably at least 90% and most preferably at least 95%, compared to a 
reference sequence using the programs described above (preferably BESTFIT) using 
standard parameters. One of skill will recognize that these values can be appropriately 
adjusted to determine corresponding identity of proteins encoded by two nucleotide 
sequences by taking into account codon degeneracy, amino acid similarity, reading frame 
positioning and the like. Substantial identity of amino acid sequences for these purposes 
normally means sequence identity of at least 40%, preferably at least 60%, more 
preferably at least 90%, and most preferably at least 95%. Polypeptides which are 
"substantially similar" share sequences as noted above except that residue positions which 
are not identical may differ by conservative amino acid changes. Conservative amino acid 
substitutions refer to the interchangeability of residues having similar side chains. For 
example, a group of amino acids having aliphatic side chains is glycine, alanine, valine,
leucine, and isoleucine; a group of amino acids having aliphatic-hydroxyl side chains is 
serine and threonine; a group of amino acids having amide-containing side chains is 
asparagine and glutamine; a group of amino acids having aromatic side chains is 
phenylalanine, tyrosine, and tryptophan; a group of amino acids having basic side chains is 
lysine, arginine, and histidine; and a group of amino acids having sulfur-containing side 
chains is cysteine and methionine. Preferred conservative amino acid substitution groups
are: valine-leucine-isoleucine, phenylalanine-tyrosine, lysine-arginine, alanine-valine, and
asparagine-glutamine.
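
The side-chain groups listed above lend themselves to a simple lookup. The following Python sketch is an illustration only; the names are hypothetical. It encodes the groups exactly as stated in the text, using one-letter amino acid codes, and tests whether a substitution between two different residues is conservative under either the broad groups or the preferred groups.

```python
# Side-chain groups as listed in the text (one-letter amino acid codes).
SIDE_CHAIN_GROUPS = {
    "aliphatic": set("GAVLI"),          # Gly, Ala, Val, Leu, Ile
    "aliphatic-hydroxyl": set("ST"),    # Ser, Thr
    "amide-containing": set("NQ"),      # Asn, Gln
    "aromatic": set("FYW"),             # Phe, Tyr, Trp
    "basic": set("KRH"),                # Lys, Arg, His
    "sulfur-containing": set("CM"),     # Cys, Met
}

# Preferred conservative substitution groups as listed in the text.
PREFERRED_GROUPS = [set("VLI"), set("FY"), set("KR"), set("AV"), set("NQ")]


def is_conservative(a: str, b: str, preferred_only: bool = False) -> bool:
    """True if replacing residue `a` with `b` stays within one listed group.

    Identical residues are not substitutions, so they return False.
    """
    a, b = a.upper(), b.upper()
    if a == b:
        return False
    groups = PREFERRED_GROUPS if preferred_only else SIDE_CHAIN_GROUPS.values()
    return any(a in group and b in group for group in groups)


if __name__ == "__main__":
    print(is_conservative("V", "L"))                       # same aliphatic group
    print(is_conservative("K", "H"))                       # both basic
    print(is_conservative("K", "H", preferred_only=True))  # not a preferred pair
```

Note that the sketch only knows the groups named in the text; a pair such as aspartate and glutamate is therefore reported as non-conservative here because no acidic group is listed.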

Another indication that nucleotide sequences are substantially identical is if 
two molecules hybridize to each other under appropriate conditions. Appropriate 
conditions can be high or low stringency and will be different in different circumstances. 
Generally, stringent conditions are selected to be about 5°C to about 20°C lower than the 
thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. 
The Tm is the temperature (under defined ionic strength and pH) at which 50% of the 
target sequence hybridizes to a perfectly matched probe. Typically, stringent wash 
conditions are those in which the salt concentration is about 0.02 molar at pH 7 and the 
temperature is at least about 60°C. However, nucleic acids which do not hybridize to 
each other under stringent conditions are still substantially identical if the polypeptides 
which they encode are substantially identical. This may occur, e.g., when a copy of a 
nucleic acid is created using the maximum codon degeneracy permitted by the genetic 
code. For Southern hybridizations, high stringency wash conditions will include at least
one wash in 0.1X SSC at 65°C.
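
The relationship between wash buffer strength, salt concentration, and the stringency window can be sketched numerically. The Python example below is a rough illustration under stated assumptions, none of which come from this specification: 1X SSC is taken as 0.15 M NaCl plus 0.015 M trisodium citrate, and the Tm of a long DNA duplex is approximated with a commonly quoted empirical formula (81.5 + 16.6 log10[Na+] + 0.41 %GC - 675/N). Actual Tm values depend on the probe and conditions and should be determined empirically. All function names are hypothetical.

```python
import math


def ssc_sodium_molar(strength: float) -> float:
    """Approximate Na+ molarity of an SSC buffer of the given strength.

    Assumes 1X SSC = 0.15 M NaCl + 0.015 M trisodium citrate, where each
    citrate contributes three sodium ions.
    """
    return strength * (0.150 + 3 * 0.015)


def estimate_tm_celsius(gc_percent: float, length: int, sodium_molar: float) -> float:
    """Empirical Tm approximation for a long DNA duplex (an assumption here)."""
    return (
        81.5
        + 16.6 * math.log10(sodium_molar)
        + 0.41 * gc_percent
        - 675.0 / length
    )


def stringent_window(tm_celsius: float) -> tuple[float, float]:
    """Temperature range about 20 C to about 5 C below Tm, per the text."""
    return (tm_celsius - 20.0, tm_celsius - 5.0)


if __name__ == "__main__":
    sodium = ssc_sodium_molar(0.1)            # 0.1X SSC wash
    tm = estimate_tm_celsius(gc_percent=50.0, length=500, sodium_molar=sodium)
    low, high = stringent_window(tm)
    print(f"Na+ in 0.1X SSC: {sodium:.4f} M")
    print(f"Estimated Tm: {tm:.1f} C, stringent range: {low:.1f} to {high:.1f} C")
```

With these assumptions, 0.1X SSC corresponds to roughly 0.02 M sodium, which is consistent with the salt concentration given above for stringent washes.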

Nucleic acids of the invention can be identified from a cDNA or genomic 
library prepared according to standard procedures and the nucleic acids disclosed here 
(typically at least 100 nucleotides to about full length) used as a probe. Low stringency 
hybridization conditions will typically include at least one wash using 2X SSC at 65°C.
The washes are preferably followed by a subsequent wash using 1X SSC at 65°C.

As used herein, a homolog of a particular RRK gene (e.g., the rice Xa21
genes disclosed here) is a second gene (either in the same species or in a different species)
which encodes a protein having an amino acid sequence having at least 25% identity or
45% similarity (determined as described above) to a polypeptide sequence in the first
gene. It is believed that, in general, homologs share a common evolutionary past.



BRIEF DESCRIPTION OF THE DRAWINGS 
Figure 1 shows the genome organization of the seven Xa21 family members 
and location of 14 transposon-like elements. Cosmid and BAC clones carrying the family 
members are designated. Wide bars represent predicted coding regions, fine bars 
represent noncoding regions, introns are indicated by angled lines, and the non-sequenced 
regions are shown by straight lines. A gap in the sequence of BAC9 is indicated by "//".
Letters refer to names of Xa21 gene family members and arrows indicate direction of 
ORFs. The 14 transposon-like elements are numbered and represented by closed triangles. 

Figure 2A shows the HC region of the sequenced Xa21 gene family 
members. Wide bars represent predicted coding regions, and fine bars represent 
non-coding regions. Start and stop codons are indicated. The 5' flanking regions and
downstream regions are grouped into four and two groups, respectively, and are shown in 
different colors based on sequence identity. The percentage of DNA sequence identity 
between promoter regions and between classes is shown to the left and right, respectively. 
The HC region is indicated by a black bar. 

Figure 2B is a schematic diagram showing a comparison of the predicted
amino acid sequences of XA21 and A1. Domains are numbered as follows: I, presumed
signal peptide; II, presumed N terminus; III, LRR; IV, charged; V, presumed
transmembrane; VI, charged; VII, juxtamembrane; VIII, serine/threonine kinase; IX,
carboxy tail. The numbers below each domain indicate amino acid identity between XA21
and A1.

Figure 3A shows family member D and insertion position of Retrofit.
Retrofit carries long terminal repeats (LTRs) (small arrows) and a single, large ORF, 
encoding a protein with the following domains: gag, protease (PR), integrase (IN), reverse 
transcriptase (RT), and RNase H (RH). The large arrow indicates direction of the ORF. 

Figure 3B shows family member E and insertion position of Truncator. 
Arrows mark the orientation of the inverted repeats. The deduced amino acid sequences 
of the tomato resistance genes Cf9 and Pto are shown below. In both Figures 3A and 3B, 
the insertion elements are designated by a hatched bar. The presumed deduced amino acid 
sequences of members D and E are shown by shaded rectangles. Domain representations
are as described in the legend to Figure 2.

Figure 4 shows intergenic recombination break point in the Xa21 family
members. Boxes represent the ORFs of the designated family members, while narrow 
boxes represent flanking regions. Same colors indicate a high level of sequence 
homology. The nucleotides of the presumed recombination break points are indicated in 
large and bold type. Sequences surrounding the recombination break point are also 
shown. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 
This invention relates to plant RRK genes, such as the Xa21 genes of rice.
Nucleic acid sequences from RRK genes, in particular Xa21 genes, can be used to confer
resistance to Xanthomonas and other pathogens in plants. The invention has use in
conferring resistance in all higher plants susceptible to pathogen infection. The invention
thus has use over a broad range of types of plants, including species from the genera
Juglans, Fragaria, Lotus, Medicago, Onobrychis, Trifolium, Trigonella, Vigna, Citrus,
Linum, Geranium, Manihot, Daucus, Arabidopsis, Brassica, Raphanus, Sinapis, Atropa,
Capsicum, Datura, Hyoscyamus, Lycopersicon, Nicotiana, Solanum, Petunia, Digitalis,
Majorana, Cichorium, Helianthus, Lactuca, Bromus, Asparagus, Antirrhinum,
Hemerocallis, Nemesia, Pelargonium, Panicum, Pennisetum, Ranunculus, Senecio,
Salpiglossis, Cucumis, Browallia, Glycine, Pisum, Phaseolus, Lolium, Zea, Avena,
Hordeum, Secale, Triticum, and Sorghum.

The Example section below describes the isolation and
characterization of RRK genes in rice, cassava, maize and tomato. The methods used to
isolate these genes are exemplary of a general approach for isolating Xa21 genes and other
RRK genes. The isolated genes can then be used to construct recombinant vectors for
transferring RRK gene expression to transgenic plants.

Generally, the nomenclature and the laboratory procedures in recombinant 
DNA technology described below are those well known and commonly employed in the 
art. Standard techniques are used for cloning, DNA and RNA isolation, amplification and 
purification. Generally enzymatic reactions involving DNA ligase, DNA polymerase, 
restriction endonucleases and the like are performed according to the manufacturer's 
specifications. These techniques and various other techniques are generally performed 
according to Sambrook et al., Molecular Cloning - A Laboratory Manual, Cold Spring
Harbor Laboratory, Cold Spring Harbor, New York, (1989).

The isolation of Xa21 and related RRK genes may be accomplished by a 
number of techniques. For instance, oligonucleotide probes based on the sequences 
disclosed here can be used to identify the desired gene in a cDNA or genomic DNA 
library. To construct genomic libraries, large segments of genomic DNA are generated by 
random fragmentation, e.g. using restriction endonucleases, and are ligated with vector 
DNA to form concatemers that can be packaged into the appropriate vector. To prepare a 
cDNA library, mRNA is isolated from the desired organ, such as leaf, and a cDNA library
which contains the RRK gene transcript is prepared from the mRNA. Alternatively, 
cDNA may be prepared from mRNA extracted from other tissues in which RRK genes or 
homologs are expressed. 

The cDNA or genomic library can then be screened using a probe (typically 
a degenerate probe) based upon the sequence of a cloned RRK gene such as the rice Xa21
genes disclosed here. Probes may be used to hybridize with genomic DNA or cDNA 
sequences to isolate homologous genes in the same or different plant species. 

Alternatively, the nucleic acids of interest can be amplified from nucleic 
acid samples using amplification techniques. For instance, polymerase chain reaction
(PCR) technology can be used to amplify the sequences of the RRK and related genes
directly from genomic DNA, from cDNA, from genomic libraries or cDNA libraries. PCR and other in
vitro amplification methods may also be useful, for example, to clone nucleic acid 
sequences that code for proteins to be expressed, to make nucleic acids to use as probes for 
detecting the presence of the desired mRNA in samples, for nucleic acid sequencing, or 
for other purposes. 

Appropriate primers and probes for identifying RRK sequences from plant 
tissues are generated from comparisons of the sequences provided herein. For a general 
overview of PCR see PCR Protocols: A Guide to Methods and Applications. (Innis, M.,
Gelfand, D., Sninsky, J. and White, T., eds.), Academic Press, San Diego (1990),
incorporated herein by reference. 

Polynucleotides may also be synthesized by well-known techniques as
described in the technical literature. See, e.g., Carruthers et al., Cold Spring Harbor
Symp. Quant. Biol. 47:411-418 (1982), and Adams et al., J. Am. Chem. Soc. 105:661
(1983). Double stranded DNA fragments may then be obtained either by synthesizing the
complementary strand and annealing the strands together under appropriate conditions, or 
by adding the complementary strand using DNA polymerase with an appropriate primer 
sequence. 
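
Designing the complementary strand for annealing amounts to computing a reverse complement. The following Python sketch is illustrative only, with hypothetical function names; it assumes plain A, C, G, T sequences written 5' to 3' and does not handle degenerate symbols.

```python
# Watson-Crick pairing for unambiguous DNA bases, preserving letter case.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(sequence: str) -> str:
    """Return the complementary strand, written 5' to 3'."""
    return sequence.translate(_COMPLEMENT)[::-1]


def can_anneal(top_strand: str, bottom_strand: str) -> bool:
    """True if the two strands (both 5' to 3') are perfectly complementary."""
    return reverse_complement(top_strand).upper() == bottom_strand.upper()


if __name__ == "__main__":
    top = "GATTACA"
    bottom = reverse_complement(top)
    print(top, bottom, can_anneal(top, bottom))
```

A synthesized oligonucleotide and the output of `reverse_complement` for it form the perfectly matched pair that would be annealed to give a double stranded fragment.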

Isolated sequences prepared as described herein can then be used to provide
RRK gene expression and therefore Xanthomonas resistance in desired plants. One of skill
will recognize that the nucleic acid encoding a functional RRK protein need not have a
sequence identical to the exemplified gene disclosed here. In addition, the polypeptides
encoded by the RRK genes, like other proteins, have different domains which perform
different functions. Thus, the RRK gene sequences need not be full length, so long as the
desired functional domain of the protein is expressed. As explained in detail below, the
proteins of the invention comprise an extracellular leucine rich repeat domain, as well as
an intracellular kinase domain. Modified protein chains can also be readily designed
utilizing various recombinant DNA techniques well known to those skilled in the art. For
example, the chains can vary from the naturally occurring sequence at the primary
structure level by amino acid substitutions, additions, deletions, and the like. Modification
can also include swapping domains from the proteins of the invention with related domains
from other pest resistance genes. For example, the extracellular domain (including the
leucine rich repeat region) of the proteins of the invention can be replaced by that of the
tomato Cf-9 gene and thus provide resistance to fungal pathogens of rice. These
modifications can be used in a number of combinations to produce the final modified
protein chain.

To use isolated RRK sequences in the above techniques, recombinant DNA
vectors suitable for transformation of plant cells are prepared. Techniques for
transforming a wide variety of higher plant species are well known and described in the
technical and scientific literature. See, for example, Weising et al. Ann. Rev. Genet.
22:421-477 (1988).

A DNA sequence coding for the desired RRK polypeptide, for example a
cDNA or a genomic sequence encoding a full length protein, will be used to construct a
recombinant expression cassette which can be introduced into the desired plant. An
expression cassette will typically comprise the RRK polynucleotide operably linked to
transcriptional and translational initiation regulatory sequences which will direct the
transcription of the sequence from the RRK gene in the intended tissues of the transformed
plant.

For example, a plant promoter fragment may be employed which will direct
expression of the RRK in all tissues of a regenerated plant. Such promoters are referred to
herein as "constitutive" promoters and are active under most environmental conditions and
states of development or cell differentiation. Examples of constitutive promoters include
the cauliflower mosaic virus (CaMV) 35S transcription initiation region, the 1'- or 2'-
promoter derived from T-DNA of Agrobacterium tumefaciens, and other transcription
initiation regions from various plant genes known to those of skill.

Alternatively, the plant promoter may direct expression of the RRK gene in
a specific tissue or may be otherwise under more precise environmental or developmental
control. Such promoters are referred to here as "inducible" promoters. Examples of
environmental conditions that may affect transcription by inducible promoters include
pathogen attack, anaerobic conditions, or the presence of light.

Examples of promoters under developmental control include promoters that
initiate transcription only in certain tissues, such as leaves, roots, fruit, seeds, or flowers.
The operation of a promoter may also vary depending on its location in the genome.
Thus, an inducible promoter may become fully or partially constitutive in certain
locations.

The endogenous promoters from the RRK genes of the invention can be used
to direct expression of the genes. These promoters can also be used to direct expression of
heterologous structural genes. Thus, the promoters can be used in recombinant expression
cassettes to drive expression of genes conferring resistance to any number of pathogens,
including fungi, bacteria, and the like.

To identify the promoters, the 5' portions of the clones described here are
analyzed for sequences characteristic of promoter sequences. For instance, promoter
sequence elements include the TATA box consensus sequence (TATAAT), which is
usually 20 to 30 base pairs upstream of the transcription start site. In plants, further
upstream from the TATA box, at positions -80 to -100, there is typically a promoter
element with a series of adenines surrounding the trinucleotide G (or T) N G. J. Messing
et al., in Genetic Engineering in Plants, pp. 221-227 (Kosuge, Meredith and Hollaender,
eds. 1983).
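
A scan for the promoter elements described above might be sketched as follows. This Python example is only an illustration under stated assumptions: the transcription start site index is supplied by the caller, offsets are measured from the first base of each motif, the consensus is searched as the exact string TATAAT, and the "series of adenines" is approximated by counting adenines in a small flanking window. The function names and the flank size are hypothetical and not part of the specification.

```python
import re

TATA_BOX = re.compile(r"TATAAT")
# Trinucleotide G (or T) N G; the lookahead allows overlapping matches.
TRINUCLEOTIDE = re.compile(r"(?=([GT][ACGT]G))")


def find_tata_boxes(sequence: str, tss_index: int) -> list[int]:
    """Offsets (negative = upstream of the start site) of TATAAT matches
    that begin 20 to 30 base pairs upstream of the transcription start."""
    seq = sequence.upper()
    offsets = [m.start() - tss_index for m in TATA_BOX.finditer(seq)]
    return [offset for offset in offsets if -30 <= offset <= -20]


def upstream_element_candidates(sequence: str, tss_index: int, flank: int = 5):
    """Candidate G/T-N-G trinucleotides at positions -100 to -80.

    Returns (offset, trinucleotide, adenine_count) tuples, where the count
    covers `flank` bases on each side of the trinucleotide.
    """
    seq = sequence.upper()
    hits = []
    for match in TRINUCLEOTIDE.finditer(seq):
        start = match.start()
        offset = start - tss_index
        if -100 <= offset <= -80:
            left = seq[max(0, start - flank):start]
            right = seq[start + 3:start + 3 + flank]
            hits.append((offset, match.group(1), (left + right).count("A")))
    return hits


if __name__ == "__main__":
    # Synthetic 5' region: an A-flanked GCG at -90 and TATAAT at -25.
    upstream = "C" * 5 + "AAAAA" + "GCG" + "AAAAA" + "C" * 57 + "TATAAT" + "C" * 19
    clone = upstream + "ATGC"
    print(find_tata_boxes(clone, tss_index=len(upstream)))
    print(upstream_element_candidates(clone, tss_index=len(upstream)))
```

Candidates found this way are only starting points; promoter activity would still need to be confirmed experimentally.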

If proper polypeptide expression is desired, a polyadenylation region at the
3'-end of the RRK coding region should be included. The polyadenylation region can be
derived from the natural gene, from a variety of other plant genes, or from T-DNA.

The vector comprising the sequences from an RRK gene will typically
comprise a marker gene which confers a selectable phenotype on plant cells. For
example, the marker may encode biocide resistance, particularly antibiotic resistance, such
as resistance to kanamycin, G418, bleomycin, hygromycin, or herbicide resistance, such
as resistance to chlorsulfuron or Basta.

Such DNA constructs may be introduced into the genome of the desired
plant host by a variety of conventional techniques. For example, the DNA construct may
be introduced directly into the genomic DNA of the plant cell using techniques such as
electroporation, PEG poration, particle bombardment and microinjection of plant cell
protoplasts or embryogenic callus, or the DNA constructs can be introduced directly to
plant tissue using ballistic methods, such as DNA particle bombardment. Alternatively,
the DNA constructs may be combined with suitable T-DNA flanking regions and
introduced into a conventional Agrobacterium tumefaciens host vector. The virulence
functions of the Agrobacterium tumefaciens host will direct the insertion of the construct
and adjacent marker into the plant cell DNA when the cell is infected by the bacteria.

Transformation techniques are known in the art and well described in the
scientific and patent literature. The introduction of DNA constructs using polyethylene
glycol precipitation is described in Paszkowski et al. Embo J. 3:2717-2722 (1984).
Electroporation techniques are described in Fromm et al. Proc. Natl. Acad. Sci. USA
82:5824 (1985). Ballistic transformation techniques are described in Klein et al. Nature
327:70-73 (1987). Using a number of approaches, cereal species such as rye (de la Pena
et al., Nature 325:274-276 (1987)), corn (Rhodes et al., Science 240:204-207 (1988)),
and rice (Shimamoto et al., Nature 338:274-276 (1989) by electroporation; Li et al. Plant
Cell Rep. 12:250-255 (1993) by ballistic techniques) can be transformed.

Agrobacterium tumefaciens-mediated transformation techniques are well
described in the scientific literature. See, for example, Horsch et al., Science 233:496-498
(1984), and Fraley et al., Proc. Natl. Acad. Sci. USA 80:4803 (1983). Although
Agrobacterium is useful primarily in dicots, certain monocots can be transformed by
Agrobacterium. For instance, Agrobacterium transformation of rice is described by Hiei et
al., Plant J. 6:271-282 (1994).

Transformed plant cells which are derived by any of the above
transformation techniques can be cultured to regenerate a whole plant which possesses the
transformed genotype and thus the desired RRK-controlled phenotype. Such regeneration
techniques rely on manipulation of certain phytohormones in a tissue culture growth
medium, typically relying on a biocide and/or herbicide marker which has been introduced
together with the RRK nucleotide sequences. Plant regeneration from cultured protoplasts
is described in Evans et al., Protoplasts Isolation and Culture, Handbook of Plant Cell
Culture, pp. 124-176, Macmillan Publishing Company, New York, 1983; and Binding,
Regeneration of Plants, Plant Protoplasts, pp. 21-73, CRC Press, Boca Raton, 1985.
Regeneration can also be obtained from plant callus, explants, organs, or parts thereof.
Such regeneration techniques are described generally in Klee et al., Ann. Rev. of Plant
Phys. 38:467-486 (1987).

The methods of the present invention are particularly useful for
incorporating the RRK polynucleotides into transformed plants in ways and under
circumstances which are not found naturally. In particular, the RRK polypeptides may be
expressed at times or in quantities which are not characteristic of natural plants.

One of skill will recognize that after the expression cassette is stably
incorporated in transgenic plants and confirmed to be operable, it can be introduced into
other plants by sexual crossing. Any of a number of standard breeding techniques can be
used, depending upon the species to be crossed.

The effect of the modification of RRK gene expression can be measured by
detection of increases or decreases in mRNA levels using, for instance, Northern blots. In
addition, the phenotypic effects of gene expression can be detected by measuring lesion
length in plants. Suitable assays for determining resistance are described in USSN
08/587,680.

The following Examples are offered by way of illustration, not limitation. 



Example 1 

As noted above, Xa21 genes make up a multigene family. Pulsed field gel
electrophoresis and genetic analysis have demonstrated that most of the members of the
Xa21 gene family are located in a 230 kb genomic region on chromosome 11 linked to at
least 8 major resistance genes and 1 QTL for resistance (Song, et al., Science 270:1804
(1995); Ronald, et al., Mol. Gen. Genet. 236:113 (1992)).

This example describes six Xa21 gene family members from the resistant
rice line IRBB21, which members are designated A1, A2, C, D, E, and F. Cloning was
as described in USSN 08/587,680; Song, et al., supra; and Wang, et al., Plant J. 7, 525
(1995). DNA sequences were determined by using the Sequitherm Long Read Cycle
Sequencing Kit (Epicentre Technologies) in combination with the LI-COR Model 4000L
Automated Sequencer (LI-COR Inc). To fill in gaps, a primer walking strategy was
performed using synthesized primers (Operon) and the Applied Biosystems 373 DNA
sequencer. GenBank accession numbers are as follows: A1: U72725 (SEQ ID NO: 4);
A2: U72727 (SEQ ID NO: 10); C: U72723 (SEQ ID NO: 6); D: U72726 (SEQ ID NO:
1); E: U72724 (SEQ ID NO: 8); F: U72728 (SEQ ID NO: 12); 3' flanking region of F:
U72729 (SEQ ID NO: 12). The Wisconsin sequence analysis programs GAP and Pileup
were used to calculate the percent identity and to carry out multiple alignments of DNA
and protein sequences, respectively.
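The percent identity figures reported in this example come from the GAP program. As a minimal sketch of what such a figure measures, the following Python function computes identity over two sequences that have already been aligned. The function name and the convention of skipping gapped columns are assumptions made for illustration; they are not taken from GAP, whose own scoring and reporting may differ.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percent identity over aligned columns in which neither sequence has a gap.

    Assumption for illustration: gapped columns are ignored entirely.
    Other conventions (e.g. dividing by alignment length) give other values.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip columns containing a gap
        compared += 1
        if x == y:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared
```

For two toy aligned strings such as "ACGT-A" and "ACGATA", five columns are compared and four match.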

Sequence data and restriction enzyme analysis of cosmid and bacterial
artificial chromosome clones indicated that the seven members are contained on 4 clones
(Fig. 1). The first clone, carrying Xa21 (described in USSN 08/587,680 and Song et al.,
supra; the GenBank accession number for the Xa21 genomic and cDNA sequences is
U37133) and member C, spans a 40 kb region; the second clone includes members D, A1,
and A2 and occupies a 150 kb region; clones of 40 kb and 130 kb contain members E and
F, respectively. Genetic and molecular data suggest that member E is inherited from the
susceptible parent IR24 (P.C. Ronald, et al., Mol. Gen. Genet. 236, 113 (1992)).

The entire coding region, the intron, and 3' flanking region of the seven
family members can be grouped into two classes. One class (designated the Xa21 class)
contains Xa21, as well as members D and F (SEQ ID NOs: 1 and 12). The second class
(designated the A2 class) contains members A1 (SEQ ID NO: 4), A2 (SEQ ID NO: 10), C
(SEQ ID NO: 6), and E (SEQ ID NO: 8). Within each class, family members share
striking nucleotide sequence identity (98.0% average identity for the members of the Xa21
class; 95.2% average identity for the members of the A2 class), compared to low levels of
DNA sequence identity between members of the two classes (e.g., 63.5% identity between
Xa21 and A2) (Fig. 2A). Only the Xa21 and A1 open reading frames (ORFs) encode
receptor kinase-like proteins. The sequences of the other family members contain alterations
causing a premature truncation of the predicted receptor kinase-like ORF (small deletions
in F and C; base pair mutations in A2; or transposon insertions in D and E). At the amino
acid level, A1 and XA21 share 68.6% identity overall. As shown in Figure 2B, Domains I
and II, carrying the presumed signal peptide and amino terminus of the protein, are 100%
identical, whereas the LRR domains (domain III) of XA21 and A1 share a low level of
identity (59.5%) and differ in the number of LRRs (23 vs. 22, respectively). In the
presumed intracellular portion, the catalytic domains (domain VIII) of XA21 and A1 are
highly conserved (82% identity), whereas the non-catalytic regions are divergent (64%
identity for domain VII (juxtamembrane) and 38.5% identity for domain IX (carboxyl
terminus)). The differences observed between members of the two classes suggest that
they may differ in function. Indeed, we have found that transgenic plants containing the A1
sequence are susceptible to all Xoo isolates tested.

A remarkable feature of the Xa21 family members is the presence of
fourteen transposable element-like sequences (M.A. Grandbastien, et al., Nature 337: 376
(1989); S.E. White, et al., Proc. Natl. Acad. Sci. U.S.A. 91: 11792 (1994)). The
positions of these elements are shown in Fig. 1. Twelve elements insert into noncoding
regions, whereas two elements, named Retrofit and Truncator, integrate into the coding
regions of members D and E, respectively, resulting in disruption of the ORFs of these
two members (Fig. 1, numbers 9 and 13). Retrofit (SEQ ID NO: 3) belongs to the
Drosophila copia class of retrotransposons and carries a large ORF showing greatest
similarity to the ORF of maize Hopscotch (68.6% similarity; 54.6% identity) and tobacco
Tnt1 (51.4% similarity; 31.9% identity) (M.A. Grandbastien, et al., Nature 337: 376
(1989); S.E. White, et al., Proc. Natl. Acad. Sci. U.S.A. 91: 11792 (1994)). The
insertion site of this element is located between the 23rd (V) and 24th (P) amino acids of
the 22nd LRR, creating a truncated molecule lacking the transmembrane and kinase
domains (Fig. 3A). Insertion of Retrofit into a presumed coding region contrasts with the
observation in yeast and maize that integration of retrotransposons is biased towards
noncoding regions (D.F. Voytas, Science 274: 737 (1996); P. SanMiguel, et al., Science
274: 765 (1996)). The fact that the truncated D confers partial resistance to Xoo suggests
that transposition events at the Xa21 locus can alter expression of resistance.

Truncator, 2913 bp, represents a novel transposon-like sequence carrying 9
bp terminal inverted repeats (TIRs). The sequence shows no significant homology to any
sequence in the database and contains no obvious ORFs. Interestingly, insertion of this
element into the amino terminus of the kinase domain of member E would presumably
result in premature truncation of the receptor kinase, resulting in a receptor-like molecule
structurally similar to the tomato fungal resistance gene products Cf9 and Cf2 (Fig. 3B)
(D.A. Jones, et al., Science 266: 789 (1994); M.S. Dixon, et al., Cell 84:451 (1996)).
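Terminal inverted repeats of the kind described for Truncator can be recognized because the start of the element is the reverse complement of its end. The following Python sketch measures the length of a perfect TIR in a candidate element. The function name and the toy sequence in the usage note are hypothetical; the sketch only handles perfect repeats and is not the method used to characterize Truncator.

```python
# Watson-Crick complements; any other character (e.g. N) never matches.
_COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def tir_length(element: str) -> int:
    """Length of the perfect terminal inverted repeat of `element`.

    Walks inward from both ends and counts positions where the base at the
    5' end equals the complement of the mirrored base at the 3' end.
    """
    seq = element.upper()
    k = 0
    # Stop at the midpoint so the two arms never overlap.
    while k < len(seq) // 2 and seq[k] == _COMPLEMENT.get(seq[-1 - k]):
        k += 1
    return k
```

For example, a toy element built as "GGCCAGTCA" followed by filler and then "TGACTGGCC" (the reverse complement of the first nine bases) has a 9 bp TIR under this definition.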

In addition to the transposition events presented above, recombination
between different family members was also found to play an important role in the
evolution of the Xa21 locus. A 269 bp highly conserved (HC) region, located immediately
downstream of the start codon of all seven family members, marks the site of intragenic
recombination events (Fig. 2A). The HC region has a high G/C content (61.8% for
Xa21), hallmarked by the typical G/C rich restriction enzyme recognition site Not I. At the
amino acid level, the HC region spans domain I and domain II of XA21 and shares nearly
100% identity among the seven family members.
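The G/C content quoted for the HC region is a simple base count, and the Not I site (GCGGCCGC) can be located by direct string search. The Python sketch below illustrates both calculations. The function names are hypothetical and the example inputs in the usage note are toy strings, not the HC region itself.

```python
NOTI_SITE = "GCGGCCGC"  # recognition sequence of the restriction enzyme Not I


def gc_content(seq: str) -> float:
    """Percent G+C among the unambiguous bases (A, C, G, T) of `seq`."""
    bases = [b for b in seq.upper() if b in "ACGT"]
    if not bases:
        raise ValueError("sequence contains no unambiguous bases")
    gc = sum(1 for b in bases if b in "GC")
    return 100.0 * gc / len(bases)


def find_sites(seq: str, site: str = NOTI_SITE) -> list:
    """1-based start positions of every (possibly overlapping) match of `site`."""
    seq = seq.upper()
    site = site.upper()
    positions = []
    start = seq.find(site)
    while start != -1:
        positions.append(start + 1)  # convert 0-based index to 1-based position
        start = seq.find(site, start + 1)
    return positions
```

Applied to the toy string "AAGCGGCCGCTT", find_sites reports a single Not I site starting at position 3.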

The HC region delimits four classes of DNA sequences (~1.3 kb) upstream
of the HC region. The 5' flanking region of family member F is divergent from that of
other family members (less than 40% identity). The precise breakpoint (from sequence
similarity to divergence) between Xa21 and F is located within the HC region, 120 bp
downstream from the start codon. This sudden change of sequence identity is unlikely to be due
to random events such as transposon insertion or deletion because such events would
presumably lead to an altered coding region. This is not the case; the deduced amino acid
sequence of F maintains the receptor kinase-like ORF. These results suggest that a
recombination event occurred in the HC region, resulting in the formation of a chimeric
sequence containing the 5' flanking region of F and a downstream region (including
coding region, intron, and 3' flanking region) of the Xa21 class.

In further support of the idea that the HC region mediates intragenic
recombination, we also observed apparent recombination breakpoints near or within the
HC region for gene family members E, A1, and C. For E, the 5' flanking region is
divergent from all other members, whereas the 3' downstream regions belong to the A2
class. The sudden change of DNA identity can be explained by a recombination event
between a progenitor A2-type gene and an unknown family member. The likely
recombination breakpoint in E is located 105 bp upstream of the HC region, since
sequences upstream of this site are quite different, compared with a high level of DNA
sequence identity downstream of this site.

The nearly identical DNA sequences of C and A1 provide the most striking
example of an HC mediated recombination event. For example, the 5' flanking region of
C shows nearly perfect identity (99.2%) to that of Xa21, whereas the downstream region
of C belongs to the A2 class. The high level of identity between the 5' flanking sequences
of Xa21 and C extends 3.8 kb upstream. This upstream region includes the functional
promoter for the Xa21 gene (W.-Y. Song, et al., Science 270:1804 (1995)). These results
strongly suggest that C was created by a recombination event in the HC region between
progenitors of the Xa21 and A2 classes. The likely recombination breakpoint in member
C is delimited by two characteristic deletions: one is located at position -37 and is only
present in Xa21 class members (Xa21, D, C, and A1); another deletion is located at
position 255 and occurs in all A2 class members.

From these results it is clear that we have identified a highly conserved,
G/C rich region in the gene family and that this region appears to be involved in high
frequency recombination between family members. Not only is the HC region present in
O. longistaminata, but it is also present in Xa21 family members of the cultivated rice
species O. sativa. (The clone RG103, spanning the HC region of an Xa21 gene family
member, was isolated from O. sativa cultivar IR36 (S. McCouch, et al., Theoret. Appl.
Genet. 76:815 (1988)). The GenBank accession number of RG103 is U82168.) The
mechanism for HC region-mediated recombination is unknown; however, two models can
be envisioned. First, this region may mediate programmed recombination similar to that
observed in African trypanosomes (R.H.A. Plasterk, Trends Genet. 8, 403 (1992)). In
trypanosomes, antigenic variation is controlled by a variant surface glycoprotein (VSG),
which is encoded by a member of a multigene family containing more than 1000 members.
Recombination at stretches of highly conserved nucleotides between silent and expressed
members of the VSG gene family leads to expression of new antigens. Alternatively, HC
mediated recombination may be an example of an ectopic recombination event where the
HC region serves as a recombination initiation site (T.D. Petes, et al., Annu. Rev. Genet.
22:147 (1988); A. Nicolas, et al., Nature 338: 35 (1989)). Frequent recombination in this
region would maintain the conservation of the HC region but allow flanking sequences to
diverge. Over time, mismatch repair would lead to homogenization of the HC region and
result in an overall increased G/C content, as has been observed in yeast (Brown T., et al.,
Cell 54, 705 (1988)).

Evidence for recombination in intergenic regions of the Xa21 family
members was also observed. First, sequences in the 5' flanking region of members C and
Xa21 are identical for 3.8 kb and then abruptly diverge. Interestingly, the same site of
divergence is observed in the 3' flanking regions of Xa21 and member F (Fig. 4). The
presence of a conserved site of divergence suggests not only that this is a recombination
breakpoint but that the Xa21/C cluster and member F are generated from the same
progenitor. Second, the sequence of a 14742 bp region spanning the Xa21/C cluster
shows 97.7% identity to the corresponding sequence (14871 bp) of the D/A1/A2 cluster
(Fig. 1), suggesting these regions evolved through sequence duplication. This duplication
process can be explained by a presumed unequal cross-over event in the intergenic region
of these two clusters.

Example 2

Using PCR amplification techniques as described in USSN 08/587,680,
Xa21 genes were isolated from cassava (SEQ ID NOs: 13-14), maize (SEQ ID NOs: 15-
16) and tomato (SEQ ID NOs: 17-29). The following is a description of the methods
used to isolate TRK1-7 from tomato. The same general procedure was used for maize and
cassava.

We designed primers in conserved regions of both the Leucine Rich Repeat
(LRR) region and the serine-threonine kinase domain of Xa21. The PCR
products should cover the region between these two domains and therefore span the transmembrane
domain. So far, two sets of primers have proven successful in amplifying three homologues
of Xa21 in tomato.

The first clone, TRK1, is a cDNA; the cDNA and the encoded polypeptide are shown in SEQ ID
NOs: 17 and 18. This clone is present as one or two copies in the tomato genome, and one
copy maps to the short arm of chromosome 1 in the proximity of a resistance gene to
Xanthomonas campestris pv. vesicatoria (Rxl) (Zu et al. (1995) Genetics 41:675-682).

The second clone, TRK2 (SEQ ID NO: 19), is a 496 bp PCR product with an
ORF encoding a polypeptide (SEQ ID NO: 20). TRK2 maps within a few cM of men
(figure 4), a mutation on chromosome 3 that mimics disease lesions. A third clone, TRK3
(SEQ ID NOs: 21 and 22), is a 473 bp fragment and maps to chromosome 8 near an
erecta-like mutant. TRK4-7 (SEQ ID NOs: 23-29) are further PCR products and encoded
polypeptides.

Primers that have proven useful are as follows.

1. LRR region

L3a. TCA AGC AAC AAT TTG TCA GGN CA(A/G) AT(A/C/T) CC

2. Kinase region

K1a. CGC CTT AGG ATT TTC AAG CTT TCC (T/C)TT (G/A)TA NAC
K2a. TAA CAG CAC ATT GCT TGA TTT NAN (G/A)TC NCG (G/A)TG
K2b. TAA CAG CAC ATT GCT TGA TTT NAN (G/A)TC (G/A)CA (G/A)TG
K2c. TAA CAG CAC ATT GCT TGA TTT NAN (G/A)TC (T/C)CT (G/A)TG

The following combinations of primers are preferred:

L3a+K1a then L3u+K1u
L3a+K2a then L3u+K2u
L3a+K2b then L3u+K2u or
L3a+K2c then L3u+K2u.
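The primers listed above are degenerate: N stands for any of the four bases and a parenthesized group such as (A/G) lists the alternatives at one position. As a minimal sketch, the following Python code parses this notation, counts how many distinct oligonucleotides a primer represents, and enumerates them. The function names are hypothetical and the parser assumes only the notation used in this example (plain bases, N, and slash-separated groups).

```python
import itertools
import re
from typing import List

# One token is either a parenthesized group like (A/C/T) or a single base/N.
_TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])*)\)|([ACGTN])")


def parse_primer(primer: str) -> List[str]:
    """Return, for each primer position, the string of allowed bases."""
    compact = primer.replace(" ", "").upper()
    positions = []
    idx = 0
    while idx < len(compact):
        match = _TOKEN.match(compact, idx)
        if match is None:
            raise ValueError(f"unrecognized primer notation at offset {idx}")
        if match.group(1):
            positions.append(match.group(1).replace("/", ""))
        elif match.group(2) == "N":
            positions.append("ACGT")  # N = any base
        else:
            positions.append(match.group(2))
        idx = match.end()
    return positions


def degeneracy(primer: str) -> int:
    """Number of distinct sequences represented by a degenerate primer."""
    total = 1
    for allowed in parse_primer(primer):
        total *= len(allowed)
    return total


def expand(primer: str) -> List[str]:
    """Enumerate every concrete sequence represented by the primer."""
    return ["".join(combo) for combo in itertools.product(*parse_primer(primer))]
```

Under these assumptions, primer L3a covers 29 positions and represents 4 x 2 x 3 = 24 distinct sequences.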

PCR conditions

First cycle:
94°C for 30 s
55°C for 30 s
72°C for 1 min

For the next 19 cycles, the annealing temperature drops 1 degree C every
cycle. After 20 cycles, 10 min at 72°C. After the initial amplification, a second round of
amplification is performed with the following specific primers, using 1 microliter of the
previous PCR.

L3u. TCA AGC AAC AAT TTG TCA
K1u. CGC CTT AGG ATT TTC AAG CTT
K2u. TAA CAG CAC ATT GCT TGA

The conditions for this amplification are:

35 cycles:
94°C for 15 s
55°C for 15 s
72°C for 1 min

After 35 cycles, 72°C for 10 min.
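The first-round conditions above describe a touchdown-style program in which the annealing temperature falls by one degree per cycle. The Python sketch below writes out the annealing temperature for each of the 20 cycles. It assumes the decrement starts with the second cycle and continues through the twentieth; the function name and parameters are hypothetical.

```python
from typing import List


def touchdown_annealing_temps(start_c: int = 55, cycles: int = 20,
                              step_c: int = 1) -> List[int]:
    """Annealing temperature (degrees C) for each cycle of the first round.

    Cycle 1 anneals at `start_c`; each later cycle anneals `step_c` lower
    than the one before it (an assumed reading of the stated conditions).
    """
    if cycles < 1:
        raise ValueError("cycles must be at least 1")
    return [start_c - step_c * i for i in range(cycles)]
```

Under this reading, the last of the 20 cycles anneals at 36 degrees C, before the final 10 min extension at 72 degrees C.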

The above examples are provided to illustrate the invention but not to limit
its scope. Other variants of the invention will be readily apparent to one of ordinary skill
in the art and are encompassed by the appended claims. All publications, patents, and
patent applications cited herein are hereby incorporated by reference.
WHAT IS CLAIMED IS:

1. An isolated nucleic acid construct comprising an RRK polynucleotide
sequence, which polynucleotide hybridizes to SEQ ID NOs: 1, 4, 6, 8, 10, or 11 under
stringent conditions.

2. The nucleic acid construct of claim 1, wherein the RRK polynucleotide
sequence encodes an RRK polypeptide having a leucine rich repeat motif.

3. The nucleic acid construct of claim 1, wherein the RRK polynucleotide
sequence encodes an RRK polypeptide having a cytoplasmic protein kinase domain.

4. The nucleic acid construct of claim 1, wherein the polynucleotide
sequence is a full length gene.

5. The nucleic acid construct of claim 1, wherein the Xa21
polynucleotide is as shown in SEQ ID NOs: 1, 4, 6, 8, 10, or 11.

6. The nucleic acid construct of claim 1, further comprising a promoter
operably linked to the RRK polynucleotide sequence.

7. The nucleic acid construct of claim 1, wherein the promoter is a
tissue-specific promoter.

8. The nucleic acid construct of claim 1, wherein the promoter is a
constitutive promoter.

9. An isolated nucleic acid construct comprising a cassava RRK
polynucleotide sequence, which polynucleotide hybridizes to SEQ ID NO: 13 under
stringent conditions.

10. The isolated nucleic acid construct of claim 9, which is SEQ ID NO:
13.

11. An isolated nucleic acid construct comprising a maize RRK 
polynucleotide sequence, which polynucleotide hybridizes to SEQ ID NOs: 15 or 16 under 
stringent conditions. 

12. The isolated nucleic acid construct of claim 11, which is SEQ ID NO: 
15 or SEQ ID NO: 16. 

13. An isolated nucleic acid construct comprising a tomato RRK 
polynucleotide sequence, which polynucleotide hybridizes to SEQ ID NOs: 17, 19, or 21 
under stringent conditions. 

14. The isolated nucleic acid construct of claim 13, which is SEQ ID NO: 
17, SEQ ID NO: 19, or SEQ ID NO:21. 

15. A transgenic plant comprising a recombinant expression cassette 
comprising a plant promoter operably linked to a Xa21 polynucleotide sequence of claim 1.

16. A method of enhancing resistance to Xanthomonas in a plant, the 
method comprising introducing into the plant a recombinant expression cassette comprising a 
plant promoter operably linked to an RRK polynucleotide sequence of claim 1. 

17. The method of claim 16, wherein the plant tissue is from rice. 

18. The method of claim 16, wherein the plant tissue is from tomato. 
Sequence Listing






DEFINITION  Oryza longistaminata receptor-like protein, family member D, and
            retrofit (gag/pol) genes, complete cds.
ACCESSION   U72726
SOURCE      long-staminate rice.
  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
COMMENT     U72725 and U72726 are separated by a large AT rich
            microsatellite region.
FEATURES             Location/Qualifiers
     source          1..13341
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     CDS             2367..4205
                     /note="Xa21 gene family member D"
                     /codon_start=1
                     /product="receptor kinase-like protein"
     misc_feature    4201..9071
                     /note="retrofit, a copia-like, transposon-like element"
     gene            4484..8821
                     /gene="gag/pol"
     CDS             4484..8821
                     /gene="gag/pol"
                     /codon_start=1
                     /product="retrofit"
     intron          9915..11712
     misc_feature    10020..10975
                     /note="Krispie, transposon-like element"
     3' flanking     12114..13341
     misc_feature    12626..12750
                     /note="Pop-Ol2, transposon-like element"
     misc_feature    13040..13248
                     /note="Ds-rice2, transposon-like element"



1 aagcttcatt ggtttcttca gttatactta cgtaggtttt tcctgtatac ataaatacgt 
61 aacagagtaa gggaattaga ttgtttaaaa taaaatacat ataatctaat agcctaaaat 
121 atcaggtcca ctgacagtgg cggatctagg atttagaata tgggtggtcc gacctaattt 
181 tttcctaaac atactaaatc taacgatggt aatatatact atgcaagtat agataataga 
241 atagaccaaa agtgtatcat gctatattaa taaagcatct taaaacatat ataattaata 
301 attacctaaa attttgactt aaagaagctc acatggctat aaaagtttaa agaaaattac 
361 catactaatt tttcttctta tcgggtctac gccttctaat ggccatgaaa gtggtcgtta 
421 tatcttcttc cttcactctt aagaaaacat cccgcttaat ggatgtgtct atactatcat 
481 ccaaaagctc atcacccatc ttttttctca accatcatta gtaaatgcat cagttctact 
541 ataatttaat atcacaatgc acaggagtaa agagttcaaa atttcaaaac tgaaaattga 
601 aaaaaaagta aaaaaaaaat agaaaacctt tttgttttgg cttggtgcag gtctgcacca 
661 gtgcgctagt gcggcactgc ggcggcagcg gccaaggtgt cgacgcgcgt gcgtggcccg
721 gtggcgctcg ctctcacgat ctgatcagat cgctgatcgc gtcggcgtcg cgactcgcga

781 gggcgaggag gagagcgaca gagagtcctg cgacggcgcg acgcttcggt ttcttaattc 
841 cgaacgatta gatacaccgt acacgcgcgt gtggtgtggg gcctgtggta atctaatggt 
901 ttaaaatatt gggtccacca atttaagtga aaatcgacgg ttagatatga tagagctacg
961 tggcagccta agagcgtttg taggagtccc acgtggcggt ttgagagcgt ttgtaggaag
1021 tttaatggac ttctagtata taatagatat ttataatttt attaagtacc ctaattttcc
1081 ctaaacaatt tttctctctc atcgtatttc catatatctt tttgagataa taatggatat
1141 aaacatagct agaaatgtaa atgttcacct tgcatcaata ggggatgaag ttgctaacct
1201 tttagatccc ctcgatttgt ataatataac caaaatattt tcaccaaaaa tttcgttaaa
1261 catccgagat atttgttgtt tttgccgatc gagcaaagat tagtagtcca gcagtgtctg
1321 caccaccacc atcgtgataa tgcatcttgt gtgttattct tgatgagaaa atacgtagtg
1381 aaaaccacat atgtggtgga aacttagaaa ctaccgttag atcgagaaat ggatgtccaa
1441 gattcgtcca cgtcaccaag agataaaatt taactcgcag attcacttat gagttaaaat
1501 tttaatgaga gttaaatttt aactcatgtt gatgtggacg aatatcggac atccatttct
1561 cgatccaacg acagcttcca agtttccact acatatgtgg tttgcactat atattttccc
1621 attcttgatt atgtgtttga gagcagctag cacaaagaga aaaaaaagca tcgtttttca
1681 cgcgtatgtt tccagaactg ttaaatggtg tgttttttga aaaaactttc tatagaaaag
1741 tttctttaaa aaatatatta atctattttt taagtttaaa ataattacta cttaattaat
1801 tatacactaa cagcttattt cgttctacgt atcttgtcaa ttttcgctat tcctttcttc
1861 tcaaacacgg cattggatgc tctcatagca cttgctcgtt cggatagaag acttgacgaa
1921 gacgaccgcx acaacctggt gtgttatatc gtgctttgtt tagcataatc attacatata
1981 ttccatgccg aagtgccgac gatgagaccg tgttcgatgc atctttgtat ggcatctagg
2041 gacaaagagc atagagtccc taccatagta cctgctcgcg tagaagactt gacgagaaga
2101 ccgactgcta caccttggtg tgtaataata tcgtgttgtg tgtaccatgc atactccttt
2161 aaaacaaaca atggtggtaa cagtaaatct gtcatcccac ccactctcat tgtaaatttt
2221 gcaagttatc acttgaactt cttaatactc catccgtttg cgtgtgttct ttcagaattt
2281 gcgtgagcac tttttcttcc atataatctg tctagtccat gagctaaacc agcatctctc
2341 gctgtcttgc cttgcacttc tgcacgatga tatcactccc attattgctc ttcgtcctgt
2401 tgttctctgc gctgctgctc tgcccttcaa gcagtgacga cgatggtgat gctgccggcg
2461 acgaactcgc gctgctctct ttcaagtcat ccctgctata ccaggggggc cagtcgctgg
2521 catcttggaa cacgtccggc cacggccagc actgcacatg ggtgggtgtt gtgtgcggcc
2581 gccgccgccg ccggcaccca cacagggtgg tgaagctgct gctgcgctcc tccaacctgt
2641 ccgggatcat cncgccgtcg ctcggcaacc tgtccttcct cagggagctg gacctcggcg
2701 acaactactt ctccggcgag ataccaccgg agctctgccg tctcagcagg cttcagctgc
2761 tggagctgag cgataactcc atccaaggga gcatccccgc ggccattgga gcatgcacca
2821 agttgacatc gctagacctc agccacaacc aactgcgagg tatgatccca cgtgagattg
2881 gtgccagctt gaaacatctc tcgaatttgt accttcacaa aaatggtttg tcaggagaga
2941 ttccatccgc tttgggcaat ctcactagcc tccaggagtt tgatttgagc ttcaacagat
3001 tatcaggagc tataccttca tcactggggc agctcagcag tctattgaat atgaatttgg
3061 gacagaacaa tctaagtggg atgatcccca attctatctg gaacctttcg tctctaagag
3121 cgttttgtgt cagcgaaaac aagctaggtg gtatgatccc tacaaatgca ttcaaaaccc
3181 ttcacctcct cgaggtgata tatatgggca ctaaccgttt ccatggcaaa atccctgcct
3241 cagttgctaa tgcttctcat ctgacacggc ttcagattga tggcaacttg ttcagtggaa
3301 ttatcacctc ggggtttgga aggttaagaa atctcacaga actgtatctc tggagaaatt
3361 tgtttcaaac tagagaacaa gaagattggg ggttcatttc tgacctaaca aattgctcca
3421 aattacaaac attgaacttg ggagaaaata acctgggggg agttcttcct aattcgtttt
3481 ccaatctttc cacttcgctt agttttcttg cacttcattt gaataagatc acaggaagca
3541 ttccgaagga tattggcaat cttattggct tacaacatct ctatctctgc aacaacaatt
3601 tcagagggtc tcttccatca tcgttgggca ggcttaaaaa cttaggcatt ctactcgcct
3661 acgaaaacaa cttgagcggt tcgatcccgt tggccatagg aaatcttact gaacttaata
3721 tcttactgct cggcaccaac aaattcagtg gttggatacc atacacactc tcaaacctca
3781 caaacttgtt gtcattaggc ctttcaacta ataaccttag tggtccaata cccagtgaat
3841 tattcaatat tcaaacacta tcaataatga tcaatgtatc aaaaaataac ttggagggat
3901 caataccaca agaaataggg catctcaaaa atctagtaga atttcatgca gaatcgaata
3961 gattatcagg taaaatccct aacacgcttg gtgattgcca gctcttacgg catctttatc
4021 tgcaaaataa tttgttatct ggtagcatcc catcagcctt gggtcagctg aaaggtctcg
4081 aaactcttga tctctcaagc aacaatttgt caggccagat acccacatcc ttagcagata
4141 ttactatgct tcattccttg aacctttctt tcaacagctt tgtgggggaa gtgccaacca
4201 tgtgaaacag tctgttctat tatcctggac atgagatata catgcatgct ccagtacatg
4261 cgatgctcta gtactgggat atggttcgtt aggataagga gattatctcc aaatttgttt
4321 caccttctta tccattgtaa ctgaatactg taagccacgg tagaggctgt ttctaccact
4381 atttaacaga gatgcggccc aaggcaaagg gtttaacgct tcacatttta tcacatggta
4441 tcagagccrt tttccacctc aatagatcgc atctttcatc cacatggcgt cgtcgtcgtc 
4501 ttcctcaggc gcggcagccg ccaatctcct ccaaggccac tcggtttcag agaaactcgg 
4561 gaaagccaac catgcattgt ggaaagcgca agttagcgct gcagtgcgtg gagcccgatt 
4621 gctgggctac ctcaacggcg atatcaaagc tccagacgcc gaactctcgg tcaccataga 
4681 tgggaagacc acaacaaagc cgaatccggc atttgaagat tgggaggcca atgaccagct 
4741 tgttcttggc tatctcctgt catctctttc aagggatgtg ctgatccaag tcgccacatg 
4801 caagacggcg gctgaggcat ggcggagcat tgaagcactc tactccaccg gcactcgagc 
4861 aagggcggcg aacaccagac tcgccctcac caacacgaag aaaggaacaa tgaagatcgc 
4921 cgagtatgtc gccaagatgc gagcgcttgg tgatgagatg gctgccggcg gtcatccact 
4981 tgatgaagaa gaccttgtcc agtacatcat cgctgggcta aatgaagact tcagcccgat 
5041 cgtctccaac ctctgcaaca agtccgatcc catcacggtt ggggagctgt attctcagct 
5101 cgtcaacttr gaaaccctcc ttgatctcta ccgcagcact ggtcagggag gagctgcttt 
5161 tgtcgctaat cgcggcaggg gcggcggcgg cggcgggcgc ggcaacaaca acaactccgg 
5221 cggcggcggc ggcagaagcg cgccgggtgg acgcggcagc ggcagccagg gtcgcggtgg 
5281 ccgtggacgc ggcacaggag gccaagacag gcgccctact tgccaagttt gtttcaagcg 
5341 tgggcataca gcagctgatt gttggtatcg cttcgacgag gactacgttg cagatgagaa 
5401 gctcgttgct gctgctacta actcgtatgg tatagataca aattggtata ttgatacagg 
5461 tgctacagac cacattaccg gtgaactaga gaagcttacc accaaggaga aatacaacgg 
5521 cggcgagcaa attcacactg ctagcggagc aggtatggat attagtcaca ttggtcatac 
5581 tattgtgcat acccctagcc gtaatattca tctaaacaat gtcctttatg ttcctcaagc
5641 caagaaaaat cttatatctg ctagtcaatt agccgctgat aattctgctt ttcttgaact 
5701 tcactcgaaa ttcttttcta taaaggatca ggtaacgagg gacgttctgc ttgaagggaa 
5761 atgtagacac ggtctctacc cgatccccaa gttctttggt cgctcaacca acaaacaagc 
5821 ccttggtgcc gccaagttat ccctgtctag gtggcatagc cgtctaggac atccgtctct 
5881 tcctattgrc aagcaagtca ttagcagaaa taatctccca tgttcagttg agtcagtcaa 
5941 tcagtctgtg tgtaatgctt gccaagaagc aaagagtcat cagttacctt atattagatc 
6001 tactagtgtg tctcaatttc ctcttgaact tgttttttct gatgtttggg gccctgctcc 
6061 agagtctgtt gggagaaata aatattatgt gagtttcatt gatgatttta gtaagtttac 
6121 ttggatatac ttgctgaaat acaagtctga ggtttttgag aaatttaaag aatttcaggc 
6181 tttagttgaa cgaatgtttg atagaaagat tattgccatg cagactgatt ggcggggggg 
6241 gagatatcag aaacttaatt ccttttttgc tcaaatagga ttgatcatca tgtgtcatgt 
6301 cctcacaccc atcaggcaga atgggtcagc tgagagaaaa caccggcata tcgtggaagt 
6361 aggcctttct cttttatctt atgcatcaat gcctcttaag ttttgggatg aagcctttgt 
6421 tgcagccact tatctcatca atcgtatacc tagtaaaacc atccaaaatt ctacacccct 
6481 agagaaactg tttaaccaaa aacctgacta ctcatccttg agagtgtttg gttgtgcatg 
6541 ttggcctcat cttcgccctt acaatacaca caaactccag tttcgctcca aacagtgcgt 
6601 gtttttgggt tttagtactc accacaaagg atttaagtgt cttgatgtgt catcaggccg 
6661 tgtctacatc tcaagagatg ttgtctttga tgaaaatgtt tttcccttct ctacactcca 
6721 ctcaaatgca ggagccagac tcaggtctga aattcttttg ttgccgtccc ccttgacaaa 
6781 ctataatacg gctagtgcag ggggaacaca tgtagttgca ccagtggcta atactccatt 
6841 acctagtgat aatttaattt ctaatgctgc tgatgtgact tctggagaaa atagtgcagc 
6901 acatgaacag gaaatggaga atgagcagga aatagagaac gtcatgcatg ggaacgacgt 
6961 gcatggggac gcggcatcgg gacctgtgct ggatcaacca actgctgaca gcagcactgc 
7021 gccggaccag ggagctgaca ccagtgacgc ggtctctggc gcagcttctg acgcgggtgg 
7081 agacactgcc accctgggag ctggagcagc aaatagcgca gcagcaggtg gtgaagaatc 
7141 ccagccggtg cagcctgatg tgacgggtac agtactggct acagtagccc ctgcatcgag 
7201 accacacact cgtctgcgga gtggtattcg aaaagagaag gtatacactg atggcaccgt 
7261 taaatatggt tgtttttctt ctactggtga accacaaaat gataaagagg ctttaggaga 
7321 taaaaactgg agagatgcaa tggaaactga gtataatgct ttgataaaaa atgacacatg 
7381 gcacctagtt ccatatgaga aaggacaaaa tatcattggg tgtaaatggg tatataagat 
7441 taaaaggaag gcagatggga cacttgatag atacaaagct agacttgtag caaaggggtt 
7501 taaacaaaga tatggtatcg attatgaaga tacttttagt cctgttgtta aagctgctac 
7561 tattagaatt attctgtcca ttgctgtctc tagaggttgg agtcttagac agttagatgt 
7621 tcagaatgcc tttcttcatg gattcttaga agaagaagtc tacatgcaac aacctcctgg 
7681 gtttgagtca tcctctaaac ctgattatgt atgtaaattg gataaggcat tatatgggct 
7741 gaaacaagca ccaagggcgt ggtattccag gctgagtaag aaacttgttg aacttggttt 
7801 tgaagcttca aaggctgata cctcattatt ctttcttaac aaaggaggga tacttatgtt 
7861
7921 
7981 
8041 
8101 
8161 
8221 
8281 
8341 
8401 
8461 
8521 
8581 
8641 
8701 
8761 
8821 
8881 
8941 
9001 
9061 
9121 
9181 
9241 
9301 
9361 
9421 
9481 
9541 
9601 
9661 
9721 
9781 
9841 
9901 
9961 
10021 
10081 
10141 
10201 
10261 
10321 
10381 
10441 
10501 
10561 
10621 
10681 
10741 
10801 
10861 
10921 
10981 
11041 
11101 
11161 
11221 
11281 



tgttttggta 
acttctgaag 
ccttggaatt 
aaatgatctg 
tgttagtgaa 
atatagaagt 
ttcagtaaac 
aaaaagaatc 
tgcttctact 
aaaatcaaca 
gaaacaacct 
tacagccgaa 
agctgccaag 
tcatgcaagg 
gaagctgtta 
ggcactgtca 
attgagaggg 
tccagtacat 
caaatttgtc 
tttctaccac 
atcacaaacc 
actctgtggt 
aaaacatttc 
atcactctac 
ttccatgaaa 
cgcgccgacc 
tatccaagat 
tttcactgcc 
tacaatttgc 
catgcccaac 
gaggcacttg 
ctatcttcac 
gctgttagat 
tgatgggacc 
ctatgcagca 
atatgaaaca 
caggggcgaa 
tatgaacagg 
ttatgactaa 
tcttcacaga 
tgtcagaaaa 
aaattataca 
acaaatcaaa 
ctttcggact 
gataatgtta 
cttcaccctt 
actcgaggac 
agtgctcggt 
ccggccatgt 
tactagtctc 
agtttgcttg 
atcgtcgggg 
aaattgcaca 
atagaggtac 

tgggtgtgtt 
gaggagtttc 
tttttttcat 
acttggcact 



tatgttgatg 
gatctaaaca 
gaggtaacta 
ctaaagagag 
aaattaactc 
atagttggtg 
aaagtctgtc 
ctcagatact 
cttgttcatg 
ggaggatttg 
actgtgtcaa 
ctgatatggg 
atttggtgtg 
acaaagcata 
gagattgatt 
gcttgtcttc 
ctgtgaaaca 
gcgatgctcc 
tcaccttctt 
tatttaacag 
attggtgctt 
ggaatacccg 
ccagttctac 
ttgcttataa 
ggccacccat 
aatttgttgg 
catgttgcag 
gaatgtgaag 
tcgagcattg 
ggcagtctgg 
aatctgcatc 
cgccatggcc 
tctgatatgg 
tcattgatac 
ccaggtcagc 
gtttttacct 
gcagaataaa 
tttatcgtaa 
agatcttttc 
tatagccttt 
aaaaaaccat 
ttaatcctcc 
tgttcatatt 
tgagaaacca 
gtaggttacc 
gccattcttc 
gataagacgc 
caaactacca 
gggctcaagt 
gttcctaatt 
actggggtat 
tctactgacc 
aaaataagat 
agattttttt 
ctgctctaca 
agaaggatca 
atgctagtta 
acttactaat 



atataattgt 
aggagttcgc 
aagtttceaa 
ttaatatgtc 
tatatgaggg 
ctttacaata 
agtttcttca 
tgaaccaatg 
ggtattctga 
cagtattttt 
ggtcaagcac 
tacaaacctt 
ataacttagg 
tagaggttga 
ttgttccatc 
ttgaaaattt 
gtctgttcta 
agtactggga 
atccattgta 
agatgcggcc 
tcgcagctgc 
atctacatct 
ctatttctgt 
cctggcacaa 
tggtctctta 
gttctggatc 
tgaaggtact 
cactacgaaa 
ataacagagg 
aagattggat 
gaagagtgac 
ctgaacctgt 
tagcccatgt 
aacagtcaac 
aagtccttcc 
ctagtgaaac 
tcatcgccgg 
atggccaaaa 
gcttgtgacc 
aagtttaagc 
ttttttgtag 
tccttactca 
tcagaacttc 
ccctagggct 
gccctcaaca 
tcgtccttgt 
ggaaagtcga 
gaagaatatg 
ctcaggcgtt 
gctattgggc 
agctagtaaa 
ccgatggtag 
tatttgccat 
atataggact 
ctgcaatatg 
aatttgagta 
caatttttta 
tcccacatgg 



agctagctct 
acttaaggat 
tggcgttatc 
aaattgcaag 
atcacccttg 
cttgaccttg 
tgctcctact 
cacaagtcta 
tgcagactgg 
gggttctaat 
agaggcagaa 
gttaaaagaa 
agctaaatat 
ttatcatttt 
aggagaccaa 
taaacacaat 
ttatcctgga 
tatggttcgt 
actgaatact 
cgaggcaaag 
atccgggatc 
gcctcgatgt 
ttctctggtc 
gagaactaaa 
ttcgcagttg 
atttggctca 
aaagcttgaa 
tatgcgacat 
gaacgatttc 
acaccctgaa 
catactactt 
tgtacactgt 
tggagatttt 
aagctcgatg 
agtattttgc 
tgatggagaa 
ggtcactact 
cacattttta 
ggtgctacga 
tagccatttg 
tgtgactaaa 
tgatcatata 
tgaattatag 
aaattgtgat 
gctcgagcgc 
tacgccaagc 
gaaatcgaga 
atgcaacagg 
taagagctga 
tttaacaaaa 
aaagcgaggg 
gctgtagcgt 
atctattcag 
ctagagctac 
aaatgattat 
aaattttcaa 
tttcacgagc 
aggtagtgaa 



acagagaagg 
ttgggagacc 
ttgactcaag 
ccagttagta 
ggtcctaatg 
acaagacctg 
accagtcatt 
ggacttcata 
gcaggtagta 
cttgtgtcct 
tataaggctg 
ttgggaattg 
ttatcagcta 
gtaagagaac 
gttgctgacg 
cttaacctag 
catgagatat 
taggataagg 
gtaagccacg 
ggtttaacgc 
tcaatccaag 
tgtccattac 
gcagcactgg 
aagggagccc 
gtaaaagcaa 
gtatacaaag 
aatcctaagg 
cgaaatcttg 
aaagcaattg 
acaaatgatc 
gatgttgcct 
gatattaaat 
gggcttgcaa 
ggatttagag 
attttctgat 
tataagtaat 
aactaatgaa 
caggtggcca 
taatcacatg 
cagaaaatga 
actatgcgta 
actgaagttt 
aaccctaact 
ggtaataatc 
tcgagcgctc 
cggcgatgca 
tgtgcggcga 
ggaatggact 
ttaattttct 
cggtgtcacc 
gtactagatg 
cgcccctgat 
atgctaaata 
cacacactca 
tacttctaca 
ttctacattt 
ttacattgac 
aataatatag 



caactacagc 
tgcactactt 
agaagtatgc 
ctcctctttc 
atgcaataca 
acatagctta 
ggattgcagt 
tacacaagag 
tagatgacag 
ggagtgctag 
tggcaaatac 
agtctcctaa 
atcctgtgtt 
gagtgtcaca 
ggtttacaaa 
ctaggttatg 
acgtgcatgc 
agattatctc 
gtagaggctg 
ttcacatttt 
gcaatgccaa 
tagagaacag 
ccatcctctc 
cttcaagaac 
cagatggttt 
gaaagcttaa 
cgctcaagag 
tcaagatagt 
tgtatgactt 
aagcagacca 
gcgcactgga 
caagcaatgt 
gaatacttgt 
ggacaattgg 
ctctagtgct 
taattgaact 
cttgcactac 
atcgtgtcgc 
gtgaaatgaa 
aaggggtggc 
agatggaaca 
gaaaacaaag 
tcctaataaa 
aagaaattgt 
aaccataacc 
cgcgtgtact 
tgcaatgcga 
ttctgggctg 
attttctact 
ggggtccagt 
tatgcacgaa 
tgaactaatt 
tagctagttc 
aatcaaatta 
tgaactgatg 
aagaaacact 
catgaaaaat 
atacaaaaac 
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11341 gaaatatcct atgttgtgtg atatactata atcacaatga acacaaacag gattcgtaca 
11401 aaagtaatta gccatcatag caactgattg cttggggtaa ctgtatagca caatcatacc 
11461 aaatttcttt agatatgtat ctgtaaatta gattcttaaa gttaaatatg aaatttcatt 
11521 ggtatttatg tttctttata taataaaaat taatccagcc tttgcatcta tcatttgtcc 
11581 agacatcctt gttatttgtg atatttaaca cgtaaattta cataattata catccaagtt
11641 ctttttattt aacactgtaa atttcaaatc gtacatgtta taaagaatgt actatatttc 
11701 ctgctcaaac agagtatggc gttgggctca ttgcatcaac gcatggagat atttacagct 
11761 atggaattct agtgctggaa atagtaaccg ggaagcggcc aactgacagt acattcagac 
11821 ccgatttggg cctccgtcaa tacgttgaac tgggcctaca tggcagagtg acggatgttg
11881 ttgacacgaa gctcattttg gattctgaga actggctgaa cagtacaaat aattctccat
11941 gtagaagaat cactgaatgc attgtttggc tgcttagact tgggttgtct tgctctcagg 
12001 aattgccatc gagtagaacg ccaaccggag atatcatcga cgaactgaat gccatcaaac 
12061 agaatctctc cggattgttt ccagtgtgtg aaggtgggag ccttgaattc tgatgttatg 
12121 tctcgtaatg ttttattgcc actcttcaga tcgacttctg cagtggtatc taccacacga 
12181 tcactaaagt caccgtggtt atttcctgat ccagcatatc tgatcatgca tgttctgtgt
12241 tgtatacctg cattttactc tgaattgcca caccgcaacc ctgcatctgt ttgtttggta 
12301 tacaaaagat agtgatgagt ttattgtttt aggggcttcc tagttggcgc gtgtggtgcc 
12361 ggcacgcacg cagcccgagg gtgggtttct tttttttcca ttgttattcc gttgcttttt 
12421 ttcaccacgg tagatctttt ttttccggat ttccattttt tccgttgttt ttctctatcg 
12481 cttatgttgg cggatttttt tccgtggttt tctttccgaa gacgagtata tctaacgtaa
12541 ctaacatgtt acttttagat aacgatggtt attaagataa gatttttctc tggaagattt
12601 ttgtaagcaa cagatcgaaa acaaatctat acgtgaggtc aaattttgaa aactttcaat 
12661 ctagatttaa aaacttttca actcaaaatt tgaatttttg aagtgaaaat ttgaaaactt 
12721 tcaaaaatta ccagtaatcg acaaaaaaaa aatggaaatg gaaacggaaa tagttttgct
12781 gttataccga tcgtttccat atttaccgta ttcttataga aattaccgtt tcttataata
12841 tggtaattac cgtatttcta aatatgttga tatttatagg gcatgtctct acttgactca 
12901 cagtttagag actgattgac tatttaatca aatccctaac ttgattgcat ggctaaaatg 
12961 gagttgattt ctaatttata tagtatagct tgaatttatt tgtaaatata acatacttat 
13021 gtaaagttaa acatatgttt tctatagttt aatgtttctg tatttgttac cggttttcga 
13081 tctgtaccga catgtttcca tcagtattat tccatgtccg gttttccgat atttccgata
13141 tcgttttcgt ttccgacttt accgttttcg atttcatttc cgagaaaaat atgattatgg
13201 aaatggtcga ggctgttttc cgatcgtttc cgaccgtttt catctctacc cgtagtaata 
13261 atatataaca ttttatctct aatctttctc tctctcatat caatgaaata atcgctaaga 
13321 gactgctatt aacaagggct t 


SEQ ID NO: 2 

/translation="MISLPLLLFVLLFSALLLCPSSSDDDGDAAGDELALLSFKSSLL
YQGGQSLASWNTSGHGQHCTWVGVVCGRRRRRHPHRVVKLLLRSSNLSGIISPSLGNL
SFLRELDLGDNYFSGEIPPELCRLSRLQLLELSDNSIQGSIPAAIGACTKLTSLDLSH
NQLRGMIPREIGASLKHLSNLYLHKNGLSGEIPSALGNLTSLQEFDLSFNRLSGAIPS
SLGQLSSLLNMNLGQNNLSGMIPNSIWNLSSLRAFCVSENKLGGMIPTNAFKTLHLLE
VIYMGTNRPHGKIPASVANASHLTRLQIDGNLFSGIITSGFGRLRNLTELYLWRNLF^
TREQEDWGFISDLTNCSKLQTLNLGENNLGGVLPNSFSNLSTSLSFLALHLNKITGSI
PKDIGNLIGLQHLYLCNNNFRGSLPSSLGRLKNLGILLAYENNLSGSIPLAIGNLTEL
NILLLGTNKFSGWIPYTLSNLTNLLSLGLSTNNLSGPIPSELFNIQTLSIMINVSKNN
LEGSIPQEIGHLKNLVEFHAESNRLSGKIPNTLGDCQLLRHLYLQNNLLSGSIPSAI^
QLKGLETLDLSSNNLSGQIPTSLADITMLHSLNLSFNSFVGEVPTM 11
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SEQ ID NO: 3 

/translation="MASSSSSSGAAAANLLQGHSVSEKLGKANHALWKAQVSAAVRGA
RLLGYLNGDIKAPDAELSVTIDGKTTTKPNPAFEDWEANDQLVLGYLLSSLSRDVLIQ
VATCKTAAEAWRSIEALYSTGTRARAVNTRLALTNTKKGTMKIAEYVAKMRALGDEMA
AGGHPLDEEDLVQYIIAGLNEDFSPIVSNLCNKSDPITVGELYSQLVNFETLLDLYRS
TGQGGAAFVANRGRGGGGGGRGNNNNSGGGGGRSAPGGRGSGSQGRGGRGRGTGGQDR
RPTCQVCFKRGHTAADCWYRFDEDYVADEKLVAAATNSYGIDTNWYIDTGATDHITGE
LEKLTTKEKYNGGEQIHTASGAGMDISHIGHT^
ASQLAADNSAFLELHSKFFSIKDQVTRDVLLEGKCRHGLYPIPKFFGRSTNKQALGAA
KLSLSRWHSRLGHPSLPIVKQVISRNNLPCSVESVNQSVCNACQEAKSHQLPYIRSTS
VSQFPLELVFSDVWGPAPESVGRNKYYVSFIDDFSKFTWIYLLKYKSEVFEKFKEFQA
LVERMFDRKIIAMQTDWRGGRYQKLNSFFAQIGLIIMCHVLTLIRQNGSAERKHRHIV
EVGLSLLSYASMPLKFWDEAFVAATYLINRIPSKTIQNSTPLEKLFNQKPDYSSLRVF
GCACWPHLRPYNTHKLQFRSKQCVFLGFSTHHKGFKCLDVSSGRVYISRDVVFDENVF
PFSTLHSMAGARLRSEILLLPSPLTNYNTASAGGTHVVAPVANTPLPSDNLISNAADV
TSGENSAAHEQEMENEQEIENVMHGNDVHGDAASGPVLDQPTADSSTAPDQGADTSDA
VSGAASDAGGDTATLGAGAANSAAAGGEESQPVQPDVTGTVLATVAPASRPHTRLRSG
IRKEKVYTDGTVKYGCFSSTGEPQNDKEALGDKNWRDAMETEYNALIKNDTWHLVPYE
KGQNIIGCKWVYKIKRKADGTLDRYKARLVAKGFKQRYGIDYEDTFSPVVKAATIRII
LSIAVSRGWSLRQLDVQNAFLHGFLEEEVYMQQPPGFESSSKPDYVCKLDKALYGLKQ
APRAWYSRLSKKLVELGFEASKADTSLFFLNKGGILMFVLVYVDDIIVASSTEKATTA
LLKDLNKEFALKDLGDLHYFLGIEVTKVSNGVILTQEKYANDLLKRVNMSNCKPVSTP
LSVSEKLTLYEGSPLGPNDAIQYRSIVGALQYLTLTRPDIAYSVNKVCQFLHAPTTSH
WIAVKRILRYLNQCTSLGLHIHKSASTLVHGYSDADWAGSIDDRKSTGGFAVFLGSNL
VSWSARKQPTVSRSSTEAEYKAVANTTAELIWVQTLLKELGIESPKAAKIWCDNLGAK
YLSMTPVFHARTKHIEVDYHFVRERVSQKLLEIDFVPSGDQVADGFTKALSACLLENF
KHNLNLARL"

SEQ ID NO: 4 
DEFINITION  Oryza longistaminata receptor kinase-like protein gene, family
            member A1, complete cds.
ACCESSION   U72725
SOURCE      long-staminate rice.
  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
COMMENT     U72725 and U72726 are separated by a large AT rich
            microsatellite region.
FEATURES             Location/Qualifiers
     source          1..8416
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     CDS             join(4771..7384,7676..8052)
                     /note="Xa21 gene family member A1; downstream of
                     microsatellite region; disease resistance gene family
                     member"
                     /codon_start=1
                     /product="receptor kinase-like protein"
     misc_feature    7432..7614
                     /note="Snap-Oil, transposon-like element"
BASE COUNT     2220 a   1984 c   1707 g   2505 t
ORIGIN



1 gccgtcgatc tgtcattttg aaacggccca ttcttttcca tctatatgca ttcatgaaat
61 acatggtata tcccatcgat cggacatcac ctgttagcgc gtacgccatc gtcgtcatca
121 acctagctag ggcaaacgca ccttactgag ctccgtcctc cgatcgccac catcaccaat
181 gaacaagcug ctgcggcctc tcggtggcct gaggttgctc aaccgagaag aacatccgtt
241 ccgatgcttc tcctcctcca tcgatctcgt cttcccaggt cgccgccgcc gccacatggc
301 aaccaccgtg acccacccgc cgccaacgga atccgctggt tcgacggcgg cggccgcgac
361 tgctgacccg ggctcggtga tgctggaacg ttggggctgc ctcaggggct ccacggcggc
421 gaacgtagtc gccgacgaca acaccgcggg agttccgcac cttccgcggc caacccctcc
481 ggtcgcctcg tccgggttgc gccgggatct ccttcatctg cttcgatcgc ggggttgatg 
541 gctacgtcat cgcggctcac ggcgactctg tcctcttccg gatgagttgg aacgactact 
601 tcgtctacat ggccgccgcc ggcaagccgc cgtcgctgac gctgctcccc gtctgcgaca 
661 tccccatgaa cgagcgctgc tgggtcagca aggaccgttt caaggacgct tccgcaccac 
721 gggccgggtg ttcgaccagc aggacaccgg catcctacgc ctccgcggcg acgacggcgg 
781 tgaggaggcg ccgctctagt ggcgcagtcc agatcgcgca cgagccgccg ttcgacacgg 
841 ccgagctctg cgtgctccgc cccggccacg gcgagtggga gctcaagatg gcggtgccca 
901 tcgtccacca tgacagcttg aatcttataa caaaccttgc tgaagctgac aaattctagc 
961 ccccagccat gaagttggaa aaatcaattt ccgattacac aaattggtta atacgcaacc 
1021 atttagtgct cttaacatga ccaggtttta catgttcgtt cggctcttag aatctgacaa 
1081 gaccttatct gctctgggcg tccccagccg aaattccatt agttttctcg gaggcttgtc 
1141 agaacagcgt aaagggacaa taggactgcc ttcaagatga ggcgatataa gaggggatca 
1201 acagacaaat attgcacata taaaacttac agaagttgat gtagatgatg agacgaccac 
1261 cacactaggc aaagaccagg tgtatagttg tactcaacaa atcacagagg tagtgagaga 
1321 tcgctacgat ctactggtcg aagatcaggt gtaggcgtat tctcgatcac ctgaagaaga 
1381 atctttaggt gttgagagat cgctactatc tactggtcaa tactagtaaa aaaacctcat 
1441 agagatcggc actataggtg ccggaacagc taaaaccggc acctataata cttttccctc 
1501 ctccgtggac tcaaagcacg taaaaccgac acctttaagc aactataggt gccggttcta 
1561 aagaagaacc gacacctata gtataggtgc tggtttttta aaaaaacccg acacctttaa 
1621 tataatatag gtgtcggttc ttctttaaaa ccgacaccaa taataaatta tacgtgtcgg 
1681 ttttttaata aaaccggcac ctatccaaac cgagcctagc tgtcgagtcg agccaatcca 
1741 ggctgacgca tatattagtc tcgtccttat cgcctcgtct ctctctctct ctctctctct 
1801 ctcttctctc tcgcctctct gtgtagcgcg cggcggtggt cgggcggcgt cccggcgtag
1861 gcaacagtgg tggcgagatg ggaggcggtg gcgcatcaca gctattcact ttagcgcatt
1921 gaataataat cggcatcatg atctacttat ggttcgtgca agggggaaga ttgtagatac
1981 acatggtcac agctaagtgc tatgaccgcg gctcatctct ccaaatagat tcatgccatc
2041 cgtacttaaa cagaacctta tgttccgtgc atcctttctg aagtcttgaa atgttttatc
2101 gatgttttac cactacgacc catctgtagg gtgtctcgca tcccgtcttg tttatgtttt
2161 tatgtgtgcc agcgcaccat tctagcacgc gctttgttcc tgaacaaaca ccttagccgt
2221 ggtattaaat ggaaataccg catgatcata gcaggaactc tcttcttcga aggatgtccg
2281 tcaacatcac ctaggtcatc tcgtctaatc ttatatcgta gtgccttaca aaccaggcgt
2341 gcttctaggt tctcgtactc ctcaacgcga tagaggatac aatcattcag acatacgtgt
2401 atcttctgta cttccagtcc gagagggcag attacctttt tagcttcgta cgttgtttcg
2461 agcaattcgt ttccctcggg aagaatattc tttacgagtt tcaataactc gccaaatgcc
2521 ttgtcagtca caccatattt tgccttccat tgtcagaatt ccagtgtggt acccaacttt
2581 ttgtgctcct attcgcaacc tgggtacaag gacgttttgt ggtatgctaa catccggtcc
2641 aatttctaga ccaacttttt cacttttgca gtcatttttt acatcgcaca acatttgacc
2701 aagattatcc gtaccgccgt cttcttcaac agctctgtcc gcctcgccta ttgtattttc
2761 tgcaaatcca tcatattgag cccagtctga aatgttgtcg tcttcttatt aattttcttc
2821 tattgcaacc cctaactctc cgtgcaaggt ctaacaatta tagccaggta tgaatcccga
2881 ttccaacaag cggatatgaa gacttcttag acgtagaata ctccttctgg tttttgcact
2941 tttgcatgga caacatacaa aacccctagg cttgtgggca ttggccacac tcaaaaaata
3001 atgcacgtca tcaataaact ccttcgaccg ccattctgca gtgtacatcc attaccgatt
3061 catctacatt aggagataat aattgtaaag gaagtccaca aaagtgaaac cacttaaata
3121 atcatataat aatttaaaac atcataatta aattgaaaac tgacggtttt aatgtattct
3181 tcttatttct aacgatttta acgagtttta aatggactta atcggagtca tgattaacta
3241 tttataaact ttatctgtct caataaattg taaatatatt tttccatgta ttatccttgt
3301 ttaattatxt ttaaaagttc taaacatatt tttaatgcat tctacttatt tctaactatt
3361 ttaaagattt tcaaatggac ctagttttct atttttattc ttcattttct atatttgccc
3421 tttgttgtct ccttttaaca attttataca aatatttata attttattaa gtaccctaat
3481 tttccctaaa caatttttct ctctcatcgt atttccatat atctttttga gataataatg
3541 gatataaaca tagctagaaa tgtaaatgtt caccttgcat caatagggga tgaagttgct
3601 aaccttttag atctcctcga tttgtataat ataaccaaaa tattttcacc aaaaatttcg
3661 ttaaacatcc gagatatttg ttgtttttgc cgatcgagca aagattagta gtccagcagt
3721 gtctgcacca ccaccatcgt gataatgcat cttgtgtgtt attcttgatg agaaaatacg
3781 tagtgaaaac cacatatacg gtggaaactt ggaaactacc gttagatcga gaaatggatg
3841 tccaagatcg tccacgtcac caagagataa aatttaactc gcagattcac ttatgagtta
3901 aaattttaat gagagttaaa ttttaactca tgttgatgtg gacgaatatc ggacatccat
3961 ttctcgatcc aacgatagct tctaagcttc cactacatat gtggtttgca ctatatattt
4021 tcccattctt gattatgtgt ttgagagcag ctagcacaaa gagaaaaaaa agcatcgttt
4081 ttcacgcgca tgttttcaga actgttaaac ggtgtgtttt tttaaaaaac tttatataga
4141 aaagtttctt taaaaaatat attaatctat tttttaagtt taaaataatt actacttaat
4201 taattataca ctaacagctt atttcgttct acgtatcttg tcaattttcg ctattccttt
4261 cttctcaaac acggcattgg atgctctcat agcacttgct cgttcggata gaagacttga
4321 cgaagacgac cgctacaact tggtgtgtta tatcgtgctt tgtttagcat aatcattaca
4381 tatattccat gccgaagtgc cgacgatgag accgtgttcg atgcatcttt gtatggcatc
4441 tagggacaaa gagcatagag tccctaccat agtacctgct tgcgcagaag acttgacgag
4501 aagaccgact gctacacctt ggtgtgtaat aatatcgtgt tgtgtgtacc atgcatactc
4561 ctttaaaaca aataatggtg gtaacagtaa atctgtcatc ccacccactc tcattgtaaa
4621 ttttgcaagt tatcacttga acttcttaat actccatccg tttgcgtgtg ttctttcaga
4681 atttgcgtga gcactttttc ttctatataa tctgtctagt ccatgagcta aaccaacatc
4741 tctcgctgtc ttgccttgca cttctgcacg atgatatcac tcccattatt gctcttcgtc
4801 ctgttgttct ctgcgctgct gctctgccct tcaagcagtg acgacgatgg tgatgctgcc
4861 ggcgacgaac tcgcgctgct ctctttcaag tcatccctgc tataccaggg gggccagtcg
4921 ctggcatctt ggaacacgtc cggccacggc cagcactgca catgggtggg tgttgtgtgc
4981 ggccgccggc acccgcacag ggtggtgaag ctgcggctgc gctcgtccaa cctgaccggg
5041 atcatctcgc catcgctggg caacctatcc ttcctcagga cgctgcaact cagcaacaac
5101 cacctgtccg gcaagatacc ccaggagctc agccgtctca gcaggctcca gcagctggta
5161 ctgaatttca acagcctatc gggtgagatt ccagctgctt tgggcaatct aaccagtctc
5221 tcagttcttg agctgaccaa caatacactg tctggttcta tcccttcatc cctgggcaag
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5281 ctcaccggcc tctataatct tgcactggct gaaaatatgc tgtctggttc catccctacg 
5341 tctttcggcc aattgcgcag attatctttc cttagcttag ccttcaacca cttaagtgga 
5401 gcgatcccag atcctatttg gaacatctcc tctctcacca tatttgaagt cgtgtccaac 
5461 aacctaactg gtacactgcc tgcaaatgca ttcagtaatc ttcctaatct gcagcaggtt 
5521 ttcatgtacc acaaccattt tcatggtcct atccctgcat cgattggtaa tgcttccagc 
5581 atctcaatat ttaccattgg tttaaactct tttagcggtg ttgttccacc ggagattgga 
5641 aggatgagaa atcttcagag actagagctt ccagaaactc ttttggaagc tgaagaaaca 
5701 aatgattgga aattcatgac ggcattgaca aattgctcca atcttcaaga agtggaactg 
5761 gcaggttgca aatttggtgg agtcctccct gattctgttt ccaatctttc ctcttcgctt 
5821 gtatctctct ccattagaga taacaaaatt tcagggagct tacctagaga tatcggtaat 
5881 ctcgttaatt tacaatatct ttctctcgct aacaactcct tgacaggatc ccttccctct 
5941 tccttcagca agcttaaaaa tttacgtcgt ctcactgtag ataacaacag gttaattggt 
6001 tctctcccac cgactatcgg taatcttaca caactaacta atatggaggt ccaatttaat 
6061 gcctttggtg gtacaatacc aagcacactt ggaaacctga ccaagctgtt tcaaataaat 
6121 cttggccaca acaactttac agggcaaatt cccattgaaa tatttagcat tcccgcactc 
6181 tctgaaatut tggatgtgtc ccataataac ttggagggat caataccaaa agaaataggg 
6241 aaacttaaaa atattgtcga attccatgct gattcgaaca aattatcggg tgagatccct
6301 agcaccattg gcgaatgcca acttctgcag catcttttcc tgcaaaacaa tttcttaaat
6361 ggtagcatcc caatagctct gactcagttg aaaggtctgg acacacttga tctctcaggc 
6421 aacaatttgc caggccagat acctatgccc ttaggggaca tgactctgct ccactcgctg
6481 aacctttcgc tcaacagctt ccacggcgaa gtgccaacca atggtgtttt tgcaaatgct
6541 tctgaaattt acacccaagg caatgcccat atttgcggtg gcatacctga actacatctt 
6601 ccgacgtgtt ccttaaaatc aagaaagaaa aggaaacatc aaattctgct gttagtggtt
6661 gttatctgtc tcgtttcgac acttgccgtc ttttcgttac tctacatgct tctaacctgt 
6721 cataagagaa gaaagaaaga agtccctgca acgacatcca tgcaaggcca cccaatgatc 
6781 acttacaagc agctggtaaa agcaacggat ggtttttcgt ccagccattt gttgggttct 
6841 ggatctttcg gctctgttta caaaggagaa tttgatagtc aagatggtga aatcacaagt 
6901 cttgttgccg cgaaggtact aaagctagaa actcctaagg cactcaagag tttcacggcc
6961 gaatgcgaaa cactacgaaa tacgcgacac cggaatcttg tcaagatagt tacgatttgc
7021 tcgagcatcg ataacagagg gaatgatttc aaagcaattg tgtatgactt catgcccaat
7081 ggcagtctgg aagattggct acaccctgaa acaaatgatc aagcagagca aaggcacttg 
7141 actctgcatc agagagtgac catactactt gatgttgcat gtgcattgga gcatcttcac 
7201 ttccatggcc ctgaacctat tgtacactgt gatattaaat caagcaatgt gttgttagat 
7261 gctgatatgg tagctcatgt tggagacttt ggacttgcaa gaatacttgt tgagggaagc 
7321 tcattgatgc aacagtcaac aagttcgatg ggaatcaggg ggacaattgg ttacgcagca 
7381 ccaggttaat cctaaactgt ttatgtctac ctcctttcat tgtttttttt ttagatttgc 
7441 tctggtccaa caaaaaatac ctaaagatac agatacttgt acctcacagt actaaatagt 
7501 ttttgatcat tgcattgtta gatccaacga tcagaaaacg atttggtacc gtgaccgtga 
7561 ggtatcggaa tctcgagata tttttttgtt cgaccgtagc aaatctattt ttttgtttgt 
7621 tttcttctct ttaatgtttt atgactatga aataattttt atttctggaa aacagagtat 
7681 ggtgtcggga acactgcctc gacacatgga gatatttaca gttatggaat tctagtgttg 
7741 gaaacagtaa ccgggatgcg gccggcagac agtacattca gaactggatt gagcctccgt 
7801 cagtacgttg aaccgggtct acatggtaga ctaatggatg ttgttgacag gaagcttggt 
7861 ttggattccg agaaatggct tcaggctcga gatgtttcgc cacgcagcag tattactgaa
7921 tgccttgttt cactgcttag acttgggctg tcttgctctc aggaattgcc atcgagtaga 
7981 acgcaagccg gagatgtcat caatgaactg cgtgccatca aagagtctct ctcgatgtca 
8041 tccgacatgt gaagatgtga gacatgctga tgttatgttg gagtatttcg ttgtaatgta 
8101 atgtgaaggg tgagtgtgtg actgcttggt tgtaagctat ttcctgatct gcccatcaga 
8161 tcatgtatct gttctattgt tgtatttctc agaacaacca cacacctaag taggagtaca 
8221 caatagtgta tttgtgtgat ttcaatattg gtgcataccc atgctatgtg aacagtcaat 
8281 cggggagcga ttcacaccat accgtgaaat cgacctaatc agctaatcta attctacagg 
8341 ctgcctttgc atgacagtgt gatattaaat tagcccagcc ctttttagca aacgatggga 
8401 gggtcaatgc tctaga 

SEQ ID NO: 5

/translation="MISLPLLLFVLLFSALLLCPSSSDDDGDAAGDELALLSFKSSLL
YQGGQSLASWNTSGHGQHCTWVGVVCGRRHPHRVVKLRLRSSNLTGIISPSLGNLSFL
RTLQLSNNHLSGKIPQELSRLSRLQQLVLNFNSLSGEIPAALGNLTSLSVLELTNNTL
SGSIPSSLGKLTGLYNLALAENMLSGSIPTSFGQLRRLSFLSLAFNHLSGAIPDPIWN
ISSLTIFEVVSNNLTGTLPANAFSNLPNLQQVFMYYNHFHGPIPASIGNASSISIFTI
GLNSFSGVVPPEIGRMRNLQRLELPETLLEAEETNDWKFMTALTNCSNLQEVELAGCK
FGGVLPDSVSNLSSSLVSLSIRDNKISGSLPRDIGNLVNLQYLSLANNSLTGSLPSSF
SKLKNLRRLTVDNNRLIGSLPLTIGNLTQLTNMEVQFNAFGGTIPSTLGNLTKLFQIN
LGHNNFIGQIPIEIFSIPALSEILDVSHNNLEGSIPKEIGKLKNIVEFHADSNKLSGE
IPSTIGECQLLQHLFLQNNFLNGSIPIALTQLKGLDTLDLSGNNLSGQIPMSLGDMTL
LHSLNLSFNSFHGEVPTNGVFANASEIYIQGNAHICGGIPELHLPTCSLKSRKKRKHQ
ILLLVVVICLVSTLAVFSLLYMLLTCHKRRKKEVPATTSMQGHPMITYKQLVKATDGF
SSSHLLGSGSFGSVYKGEFDSQDGEITSLVAVKVLKLETPKALKSFTAECETLRNTRH
RNLVKIVTICSSIDNRGNDFKAIVYDFMPNGSLEDWLHPETNDQAEQRHLTLHQRVTI
LLDVACALEHLHFHGPEPIVHCDIKSSNVLLDADMVAHVGDFGLARILVEGSSLMQQS
TSSMGIRGTIGYAAPEYGVGNTASTHGDIYSYGILVLETVTGMRPADSTFRTGLSLRQ
YVEPGLHGRLMDVVDRKLGLDSEKWLQARDVSPRSSITECLVSLLRLGLSCSQELPSS
RTQAGDVINELRAIKESLSMSSDM"
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SEQ ID NO: 6 

DEFINITION  Oryza longistaminata receptor kinase-like protein (Xa21) gene,
            complete cds and family member C, pseudogene.
ACCESSION   U72723
SOURCE      long-staminate rice.
  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
FEATURES             Location/Qualifiers
     source          1..19639
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     gene            5213..18201
                     /gene="Xa21"
     CDS             join(5213..7889,8732..9132)
                     /gene="Xa21"
                     /note="disease resistance gene"
                     /codon_start=1
                     /product="receptor kinase-like protein"
     3' flanking     9132..15118
     misc_feature    9645..9769
                     /note="Pop-OH, transposon-like element"
     misc_feature    13040..13248
                     /note="Ds-ricel, transposon-like element"
     CDS             join(15118..17720,17827..18201)
                     /gene="Xa21"
                     /note="family member C; 2 bp deletion causing a
                     frame-shift of the ORF compared to family member A1"
                     /codon_start=1
     misc_feature    16183..16184
                     /gene="Xa21"
                     /note="location of a 2 bp deletion compared to family
                     member A1"
BASE COUNT     5380 a   4394 c   3800 g   6065 t

1 aagctttgct catttcttct ccaattacaa ttaatcgtgt gtgcatcaag taattaatta 
61 agaaactctg gcttgttgaa aggccgcagt gacaattaat cgtgagtgca taggatgggg
121 aaaacacgag ggatcggtcg accatcgggg ggagcaaaaa tcaagcgctc ccccgcccca
181 cgacgcgcac atgccacgcc accccaccac gcaccacgtg tgccgttgta aacacctgcc
241 acgtgtcccc atcacacacc ccgttgtaac aaatcaccca tatattttgg aaaccctata
301 ttaggagaat tcgtttcatt ttttttctac caaaaatatt tcacctagtg tactcacaat
361 gtttcactat gtatagatct aattttgtag taaattgaaa cattcttttg caataaatta
421 cccatatatt atggaaaccc tctattaagg gaattcgttt cattttttat tccactaaaa
481 atgtttcgcc tagtgtactt gtaatgtttc actatgtatg gattaaatgt tgcagtgact
541 tgaaacattc tttcgctatt tgctgaaaca ttgtttttat ataaggtgaa acagcgcccg 
601 atttaaacga ttgaaatatt ttcgatctac ttagtgaaac aattccaata tacttggtgc 
661 aacaacgtgc aacatttaat tataattcca taataagctt gtcgacattt gatagtcacg 
721 actagggtat ttggatggta tggggatcat cggtaccagg gtatatgcga gattgaggta 
781 aaagagatgg agatagggat ttttatatag gttcgggccc cttatctgat aggtaatagc 
841 cctacatcct gtttatatgt gattgatata gaaaaaccac agaatacaac aattgggata 
901 acctatctag ccgttgttga cttggcggca cgaacaccaa ctcgtagtcg acgacgggga 
961 agcctttctc ctcgattgtg aactcgacaa gattagagat atcgctagat ccctcttgcc 
1021 ggcctctgta ggtaccggat ggggtgtgtc taggctaatc tcagatgtcg atgtttggcg 
1081 gcgtattggc ttgtgtcttg tggcttctat gttgtgtgtc ccctctcctc caatggaggc 
1141 ttggatttag actcatagat ttccccttgt ccaagtagaa ctagggagac caatatagat 
1201 acaatccgag tagtacttgt cgtttctata tagaactcta ttttgtcctt ccttatccgg 
1261 aactccttct atatatgagg tatgtttccg tataagactt ggtatgtggt gggcctcgcc 
1321 gagcttagtc gattactatt gggtatgtgg tatcctaggc cccagctgcc attttccaca 
1381 aagacagctt gaatcttata acaaaccttg ctgaagctga caaatcctag cccccagcca 
1441 tgaagttgga aaaatcaatt tccgattaca caaattggtt aatacgcaac catttagtgc 
1501 tcttaacatg accaggtttt acatgttcat tcggctctta gaatctgaca agaccttatc 
1561 tgctctgggc gtccccagcc gaaattccat tagttttctc ggaggcttgt cagaacagcg 
1621 taaagggaca ataggactgc cttcaagatg aggcgatata agaggggatc aacagacaaa 
1681 tattgcacat ataaaactta cagaagttga tgtagatgat gagacgacca ccacactagg 
1741 caaagaccag gtgtatagtt gtactcaaca aatcgcagag gtagtgagag atcgctacga 
1801 tctactggtc gaagatcagg tgtaggcgta ttctcgatca cctgaagaag aatctttagg 
1861 tgttgagaga tcgctactat ctactggtca acactagtaa aaaaacctca tagagatcgg 
1921 cactataggt gccgaacagc taaaacccgc acctatatac tttcctctcg tggactcaaa 
1981 gcacgtaaac cgacacttta gcaactatag gtgcggtcta aagagaccga cactatagta 
2041 tagtgctggt tttaaaaaac ccgacacttt aatataatat agtgtcggtc tctttaaaac 
2101 gacacaatat aaatatacgt gtcggttttt aataaacggc acctatcaaa ccgagcctag 
2161 ctgtcgagtc gagccaatcc aggtgacgca tatattagtc tcgtccttat cgcctccggt 
2221 ctctctctct ctctctctct ctctttctct ctcgcctctc tgtgtagcgc gcggcggtgg 
2281 tcgggcagcg tcccggcgta ggcaacagtg gtggcgagat gggaggcggt ggcgcatcac 
2341 agctattcac tttagcgcct tgaataataa tcggcatcat gatctactta tgcttcgtgc 
2401 aagagggaag aatgtagata cacatggtca cagctaagtg ctatgatcgc ggctcatctc 
2461 tccaaataga ttcatgccat ccgtacttaa acagaacctt atgttccgtg catcctttct
2521 gaagtcttga aatgttttat cgatgtttta ccactacgac ccatctgtag ggtgtctcgc
2581 atcccgtcct gtttatgttt ttatgtgtgc cagcgcacca ttctagcacg cgctttgttc
2641 ctgaacaaac accttagccg tggtattaaa tggaaatact gcatgatcat agcaggaact
2701 ctcttcttcg aaggatgtcc gtcaacatca cctaggtcat ctcgtctaat cttatatcgt
2761 agtgccttac aaaccaggcg tgcttctagg ttctcgtact cctccacgcg atagaggata
2821 caatcatcca gacatacgtg tatcttctgt acttccagtc cgagagggca gattaccttt
2881 ttagcttcgt acgttgtttc gagcaattcg tttccctcgg gaagaatatt ctttacgagt
2941 ttcaataact cgccaaatgc cttgtcagtc acaccatatt ttgccttcca ttgtcagaat
3001 tccagtgtgg tacccaactt tttgtgctcc tattcgcaac ctgggtacaa ggacgttttg
3061 tggtatgcca acatccggtc caatttctag accaactttt tcacttttgc agtcattttt
3121 tacatcgcac aacatctgac caagattatc cgtaccgccg ttttcttcaa cagctctgtc
3181 cgcctcgcct attgtatttt ctgcaaatcc atcatattga gcccagtctg aaatgttgtc
3241 gtcttcttat taattttctt ctattgcaac ccctaactct ccgtgcaagg tctaacaatrt
3301 atagccaggc atgaatcccg attccaacaa gtggatatga agacttctta gatgtagaat
3361 actccttctg gtttttgcac ttttgcatgg acaacatata aaacccttag gcttgtgggc
3421 attggccaca ctcaaaaaat aatgcacgtc atcaataaac tccttcgacc gtcgttctgc
3481 agtgtacatc cattaccgat tcatctacat taggagataa taattgtaaa ggaagtccac
3541 aaaagtgaaa ctacttaaat aatcatataa taatttaaaa tatcataatt aaattgaaaa
3601 ctgacggttx taatgtattc ttcttatttc taacgatttt aacgagtttt aaatggactt
3661 aatcggagec atgattaact atttataaac tttatccgtc tcaataaatt gtaaatatat
3721 ttttccatgt attatccttg tttaattatt tttaaaagtt ctaaacatat ttttaatgca
3781 ttctacttat ttctaactat tttaaagatt ttcaaatgga cttagttttc tatttttatt
3841 cttcatttrc tatatttgcc ctttgttgtc tctttttaac aattttatac aaatatttat
3901 aattttatta agtaccctaa ttttccctaa acaatttttc tctctcatcg tatttccata
3961 tatctttttg agataataat ggatataaac atagctagaa atgtaaatgt tcaccttgca
4021 tcaatagggg atgaagttgc taacctttta gatctcctcg atttgtataa tataaccaaa
4081 atattttcac caaaaatttc gttaaacatc cgagatattt gttgtttttg ccgatcgagc
4141 aaagattagt agtccagcag tgtctgcacc accaccatcg tgataatgca tcttgtgtgt
4201 tattcttgat gagaaaatac gtagtgaaaa ccacatatgt ggtggaaact tagaaactac
4261 cgttagatcg agaaatggat gtccaagatt cgtccacgtc accaagagat aaaatttagc
4321 tcgcagattc acttatgagt taaaatttta atgagagtta aattttaact catgttgatg
4381 tggacgaata tcggacatcc atttctcgat ccaacgatag cttccaagtt tccactacat
4441 atgtggtttg cactatatat tttcccattc ttgattatgt gtttgagagc agctagcaca
4501 aagagaaaaa aaagcatcgt ttttcacgcg tatgttttca gaactgttaa atggcgtgtt
4561 ttttgaaaaa actttctata gaaaagtttc tttaaaaaat atattaatct attttttaag
4621 tttaaaataa ttactactta attaattata cactaacagc ttatttcgtt ctacgtatct
4681 tgtcaatttt cgctattcct ttcttctcaa acacggcatt ggatgctctc atagcacttg
4741 ctcgttcgga tagaagactt gacgaagacg accgccacaa cttggtgtgt tatatcgtgc
4801 tttgtttagc ataatcatta catatattcc atgccgaagt gccgacgatg agaccgtgtt
4861 cgatgcatct ttgtatggca tctagggaca aagagcatag agtccctacc atagtaccag
4921 ctcgcgcaga agacttgacg agaagaccga ctgctacacc ttggtgtgta ataatatcgt
4981 gttgtgtgta ccatgcatac tcctttaaaa caaataatgg tggtaacagt aaatctgtca
5041 tcccacccac tctcattgta aattttgcaa gttctcactt gaacttctta atactccatc
5101 cgtttgcgtg tgttctttca gaatctgcgt gagcactttt tcttctatat aatctgtcta
5161 gtccatgagc taaaccaaca tctctcgctg tcttgccttg cacttctgca cgatgatatc
5221 actcccatta ttgctcttcg tcctgttgtt ctctgcgctg ctgctctgcc cttcaagcag
5281 tgacgacgat ggtgatgctg ccggcgacga actcgcgctg ctctctttca agtcatccct
5341 gctataccag gggggccagt cgctggcatc ttggaacacg tccggccacg gccagcactg
5401 cacatgggtg ggtgttgtgt gcggccgccg ccgccgccgg cacccacaca gggtggtgaa
5461 gctgctgctg cgctcctcca acctgtccgg gatcatctcg ccgtcgctcg gcaacctgtc
5521 cttcctcagg gagctggacc tcggcgacaa ctacctctcc ggcgagatac caccggagct
5581 cagccgtctc agcaggcttc agctgctgga gctgagcgat aactccatcc aagggagcat
5641 ccccgcggcc attggagcat gcaccaagtt gacatcgcta gacctcagcc acaaccaact
5701 gcgaggtatg atcccacgtg agattggtgc cagcttgaaa catctctcga atttgtacct
5761 ttacaaaaat ggtttgtcag gagagattcc atccgctttg ggcaatctca ctagcctcca
5821 ggagtttgat ttgagcttca acagattatc aggagctata ccttcatcac tggggcagct
5881 cagcagtcta ttgactatga atttgggaca gaacaatcta agtgggatga tccccaattc
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5941 tatctggaac 
6001 gatccctaca 
6061 ccgtttccat 
6121 gatttatggc 
6181 cacagaactg 
6241 catttctgac 
6301 ggggggagtt 
6361 tgaattgaat 
6421 acatctctat 
6481 taaaaactta 
6541 cataggaaat 
6601 gataccatac 
6661 ccttagtggt 
6721 tgtatcaaaa 
6781 agtagaattt 
6841 ttgccagctc 
6901 agccttgggt 
6961 ccagataccc 
7021 cagctttgtg 
7081 ccaaggcaat 
7141 attactagag 
7201 actggccatc 
7261 agccccttca 
7321 agcaacagat 
7381 caaaggaaag 
7441 taaggcgctc 
7501 tcttgtcaag 
7561 aattgtgtat 
7621 tgatcaagca 
7681 tgcctgcgca 
7741 taaatcaagc 
7801 tgcaagaata 
7861 tatagggaca 
7921 ctgatctcta 
7981 gtaattaatt 
8041 tgctaaatat 
8101 acacactcaa 
8161 acttctacat 
8221 tctacattta 
8281 tacattgacc 
8341 ataatataga 
8401 cacaaacagg 
8461 tgtatagcac 
8521 ttaaatatga 
8581 ttgcatctat 
8641 ataattatac 
8701 aaagaatgta 
8761 catggagata 
8821 actgacagta 
8881 ggcagagtga 
8941 agtacaaata 
9001 gggttgtctt 
9061 gaactgaatg 
9121 cttgaattct 
9181 agtggtatct 
9241 gatcatgcat 
9301 tgcctctgtt 
9361 agttggcgcg 



ctttcgtctc 

aatgcattca 

ggcaaaatcc 

aacttgttca 

tatctctgga 

ctaacaaatt 

cttcctaatt 

aagatcacag 

ctctgcaaca 

ggcattctac 

cttactgaac 

acactctcaa 

ccaataccca 

aataacttgg 

catgcagaat 

ttacggtatc 

cagctgaaag 

acatccttag 

ggggaagtgc 

gccaaactct 

aacagaaaac 

ccctcatcac 

agaacttcca 

ggtttcgcgc 

cttaatatcc 

aagagtttca 

atagttacaa 

gacttcatgc 

gaccagaggc 

ccggactatc 

aatgtgctgt 

cttgttgatg 

attggctatg 

gtgctatatg 

gaactaatta 

agctagttca 

atcaaattat 

gaactgatgg 

agaaacactt 

atgaaaaata 

tacaaaaacg 

attcgtacaa 

aatcatacca 

aatttcattg 

catttgtcca 

atccaagttc 

ctatatttcc 

tttacagcta 

cattcagacc 

cggatgttgt 

attctccatg 

gctctcagga 

ccatcaaaca 

gatgttatgt 

accacacgat 

gttctgtgtt 

tgtttggtat 

tgtgcatgcc 




taagagcgtt 
aaacccttca 
ctgcctcagt 
gtggaattat 
gaaatttgtt 
gctccaaatt 
cgttttccaa 
gaagcattcc 
acaatttcag 
tcgcccacga 
ttaatatctt 
acctcacaaa 
gtgaattatt 
agggatcaat 
cgaatagatt 
tttatctgca 
gtctcgaaac 
cagatattac 
caaccattgg 
gtggtggaat 
atttcccagt 
tctacttgct 
tgaaaggcca 
cgaccaattt 
aagatcatgt 
ctgccgaatg 
tttgctcgag 
ccaacggcag 
acttgaatct 
ttcaccgcca 
tagattctga 
ggacctcatt 
cagcaccagg 
aaatagtttt 
aattgcacaa 
tagaggtaca 
gggcgctttic 
aggagtttca 
ttttttcata 
cttggcacta 
aaatatccta 
aagtaattag 
aatttcttta 
gtatttatgt 
gacatccttg 
tttttattta 
tgctcaaaca 
tggaattcta 
cgatttgggc 
tgacacgaag 
tagaagaatc 
attgccatcg 
gaatctctcc 
ctcgtaatgt 
cactaaagtc 
gtatacctgt 
acaaaagata 
ggcatgcacg 



tagtgtcaga 

cctcctcgag 

tgctaatgct 

cacctcgggg 

tcaaactaga 

acaaacattg 

tctttccact 

gaaggatatt 

agggtctctt 

aaacaacttg 

actgctcggc 

cttgttgtca 

caatattcaa 

accacaagaa 

atcaggtaaa 

aaataatttg 

tcttgatctc 

tatgcttcat 

tgctttcgca 

acctgatcta 

tctacctatt 

tataacctgg 

cccattggtc 

gttgggttct 

tgcagtgaag 

tgaagcacta 

cattgataac 

tctggaagat 

gcatcgaaga 

tggccctgaa 

tatggtagcc 

gatacaacag 

tcagcaagtc 

tacctctagt 

aaataagatt 

gattttttta 

tgctctacac 

gaaggatcaa 

tgctagttac 

cttactaatt 

tgttgtgtga 

ccatcatagc 

gatatgtatc 

ttctttatat 

ttatttgtga 

acactgtaaa 

gagtatggcg 

gtgctggaaa 

ctccgtcagt 

ctcattttgg 

actgaatgca 

agtagaacgc 

ggattgtttc 

tttattgcca 

accgtggcta 

attttactct 

gtgatgagtt 

cagcccgagg 



gaaaacaagc 
gtgatagata 
tctcatttga 
tttggaaggt 
gaacaagatg 
aacttgggag 
tcgcttagtt 
ggcaatctta 
ccatcatcgt 
agcggttcga 
accaacaaat 
ttaggccttt 
acactatcaa 
atagggcatc 
atccctaaca 
ttatctggta 
tcaagcaaca 
tccttgaacc 
gctgcatccg 
catctgcctc 
tctgtttctc 
cacaagagaa 
tcttattcgc 
ggatcatttg 
gtactaaagc 
cgaaatatgc 
agagggaacg 
tggatacacc 
gtgaccatac 
cctgttgtac 
catgttggag 
tcaacaagct 
cttccagtat 
gaaactgatg 
atttgccata 
tataggactc 
tgcaatatga 
atttgagtaa 
atttttttat 
cccacatgga 
tatactataa 
aactgattgc 
tgtaaattag 
aataaaaatt 
tatttaacac 
tttcaaatcg 
ttgggctcat 
tagtaaccgg 
acgttgaact 
attctgagaa 
ttgtttggct 
caaccggaga 
cagtgtgtga 
cacttcagat 
tttcctgatc 
gaattgccac 
tattgtttta 
gtgggtttct 



taggtggtat 

tgggcactaa 

cagtgattca 

taagaaatct 

attgggggtt 

aaaataacct 

ttcttgcact 

ttggcttaca 

tgggcaggct 

tcccgttggc 

tcagtggttg 

caactaataa 

taatgatcaa 

tcaaaaatct 

cgcttggtga 

gcatcccatc 

atttgtcagg 

tttctttcaa 

ggatctcaat 

gatgttgtcc 

tggccgcagc 

ctaaaaaggg 

agttggtaaa 

gctcagtata 

ttgaaaatcc 

gacatcgaaa 

atttcaaagc 

ctgaaacaaa 

tacttgatgt 

actgtgatat 

attttgggct 

cgatgggatt 

tttgcatttt 

gagaatataa 

tctattcaga 

tagagctacc 

aatgattatt 

atttttcaat 

ttcacgagct 

ggtagtgaaa 

tcacaatgaa 

ttggggtaac 

attcttaaag 

aatccagcct 

gtaaatttac 

tacatgttat 

tgcatcaacg 

gaagcggcca 

gggcctacat 

ctggctgaac 

gcttagactt 

tatcatcgac 

aggtgggagc 

cgacttctgc 

cagcatatct 

accgcaaccc 

ggggcttcct 

tttttttcca 
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9421 
9481 
9541 
9601 
9661 
9721 
9781 
9841 
9901 
9961 
10021 
10081 
10141 
10201 
10261 
10321 
10381 
10441 
10501 
10561 
10621 
10681 
10741 
10801 
10861 
10921 
10981 
11041 
11101 
11161 
11221 
11281 
11341 
11401 
11461 
11521 
11581 
11641 
11701 
11761 
11821 
11881 
11941 
12001 
12061 
12121 
12181 
12241 
12301 
12361 
12421 
12481 
12541 
12601 
12661 
12721 
12781 
12841 



ttgttattcc 
ccgttgtttt 
acgagtatat 
atttttctct 
aattttgaaa 
agtgaaaatt 
aaacggaaat 
attaccgttt 
catgtctcta 
tgattgcatg 
gtaaatataa 
atttgttacc 
ttttccgata 
gagaaaaata 
atccctaccc 
aatgaataat 
agtcattttg 
atcccatcga 
gggcaaacgc 
tgctgcggcc 
tctcctcctc 
tgacccaccc 
cggcctcggt 
tcgccgacga 
ccctcgcccg 
gctacgtcat 
tcgtctacat 
catgaacgag 
ccgggagttc 
ggaggcgccg 
agctctgcgt 
tccaccatga 
cagccatgaa 
tagtgctctt 
cttatctgct 
acagcgtaaa 
gacaaatatt 
cactaggcaa 
gctacgatct 
ctctaggtgt 
agatcggcac 
actcaaagca 
tatagtatag 
ttaaaacgac 
cctagctgtc 
catctctctc 
ggtggtcggg 
atcacagcta 
cgtgcaaggg 
atctctccaa 
tttctgaagt 
ctcgcatccc 
tgttcctgaa 
gaactctctt 
atcgtagtgc 
ggatacaatc 
ccattttagc 
cgagtttcaa 



gttgcttttt 
tctctatcgc 
ctaacgtaac 
ggaagatttt 
actttcaatc 
tgaatacttt 
agttttgctg 
cttataatat 
cttgactcac 
gctaaaatgg 
catacttatg 
ggttttcgat 
tttccgatat 
tgattatgga 
gcagtaataa 
cgctaagaga 
aaacggccca 
tcggacatca 
accttgctga 
tctcggtggc 
catcgatctc 
gccgccgacg 
gatgctggaa 
caacaccgcc 
cgcgtcgccg 
cgcggctcac 
ggccgccggc 
cgctgctggg 
gaccagcagg 
ctctagtggc 
gctccgcccc 
cagcttgaat 
gttggaaaaa 
aacatgacca 
ctgggcgtcc 
gggacaatag 
gcacatataa 
agaccaggtg 
actggtcaaa 
tgagagatcg 
tataggtgcc 
cgtaaaccga 
tgctggtttt 
acaatataaa 
gagtcgagcc 
tctctctctc 
cggcatcccg 
ttcactttag 
ggaagattgt 
atagattcat 
cttgaaatgt 
gtcttgttta 
caaacacctt 
cttcgaagga 
cttacaaacc 
attcagacat 
ttcgtacgtt 
taactcgcca 



ttcaccacgg 
ttatgttggc 
taacatgtta 
tgtaagtaac 
tagatttaaa 
caaaaattac 
ttataccgat 
ggtaattacc 
agtttagaga 
agttgatttc 
taaagttaaa 
ctgtaccgac 
cgttttcgtt 
aatggtcgag 
tatataacat 
ctgctattaa 
cttcttttcc 
cctgttagcg 
gctccgatcc 
ctgaggttgc 
gtcttcccag 
gaatccgctg 
cattggggct 
gcggagcccc 
ccggcgatct 
ggcgactctg 
ggccgccgtc 
tcagcaagga 
acaccggcat 
gcagtccaga 
ggccacggcg 
cttataacaa 
tcaatttccg 
ggttttacat 
ccagccgaaa 
gactgccttc 
atacttacag 
tatagttgta 
gatcaggtgt 
ctactatcta 
gaacagctaa 
cactttagca 
aaaaaacctg 
tatacgtgtc 
aatccaggtg 
tctctctctt 
gcgtaggcaa 
cgccttgaat 
agatacacat 
gccatccgta 
tttatcgatg 
tgtttttatg 
agccgtggta 
tgtccgtcaa 
aggcgtgctt 
acgtgtatct 
gtttcgagca 
aatgccttgt 



tagatttttt 
ggattttttt 
cttttagata 
agattgaaaa 
agcttttcaa 
tagtaatcga 
cgtttccata 
gtatttctaa 
ttgattgact 
taatttatat 
tatatgtttt 
atatttccat 
tccgacttta 
gctgttttcc 
tttatctcta 
caaggcttat 
atctatatgc 
cgtacgccat 
tccgatcgcc 
tcaaccgaga 
gtcgccgccg 
gttcgacggc 
gcctcagggg 
gcacctcccg 
ccttcatctg 
tcctcttccg 
gctgacgctg 
ccgtttcaag 
cctgcgcctc 
tcgcgcacga 
agtgggagct 
accttgctga 
attacacaaa 
gttcgttcgg 
ttccattagt 
aagatgaggc 
aagttgatgt 
ctcaacaaat 
aggcgtattc 
ctggtcaaaa 
aacccgcacc 
actataggtg 
acactttaat 
ggttttaata 
acgcatatat 
tctctctcgc 
cagtggtggc 
aataatcggc 
ggtcacagct 
cttaaacaga 
ttttaccact 
tgtgccagcg 
ttaaatggaa 
catcacctag 
ctaggttctc 
tctgtacttc 
attcgtttcc 
cagtcacacc 



tttccggatt 
ccgtggtttt 
acgatggtta 
caaatctata 
ctcaaaattt 
caaaaaaaat 
tttaccgtat 
atatgttgat 
atttaatcaa 
agtatagctt 
ctatagttta 
cagtattatt 
ccgttttcga 
gatcgtttcc 
atctttctct 
atatatatat 
attcatgaaa 
cgtcgtcatc 
accatcacca 
agaacatccg 
ccgccacatg 
ggcggccgcg 
ctccacgccg 
cggccaaccc 
cttcgatcgc 
gatgagttgg 
ctccccgtct 
gacagcttcc 
cgcggcgacg 
gccgccgttc 
caagatggcg 
agctgacaaa 
ttggttaata 
ctcttagaat 
tttctcggag 
gatataagac 
agatgatgag 
cgcagaggta 
tcgatcacct 
ctagtaaaaa 
tatatacttt 
cggtctaaag 
ataatatagt 
aacggcacct 
tagtctcgtc 
ctctctgtgt 
gagatgggag 
atcatgatct 
aagtgctatg 
accttatgtt 
acgacccatc 
caccattcta 
atactgcatg 
gtcatctcgt 
gtactcctca 
cagtctgaga 
ctcgggaaga 
atattttgcc 



tccatttttt 
ctttccgaag 
ttaagataag 
cgtgaggtca 
gaatttttga 
atggaaatgg 
tcttatagaa 
atttataggg 
atccctaact 
gaatttattt 
atgtttctgt 
ccatttccgg 
tttcatttcc 
gaccgttttc 
ctctcatatc 
gccgtcgatc 
tacatggtat 
aacctagcta 
atgaacaagc 
ttccgatgct 
gcaaccaccg 
actgctgacc 
gcgaacgtag 
ctccgggtcg 
ggggatgatg 
aacgactact 
gcgacatccc 
acaccacggg 
acggcggcga 
gacacggccg 
gtgcccatcg 
tcctagcccc 
cgcaaccatt 
ctgacaatac 
gcttgtcaga 
gggatcaaca 
acgaccacca 
gtgagagatc 
gaagaagaat 
aacctcatag 
cctctcgtgg 
agaccgacac 
gtcggtctct 
atcaaacgat 
cttatcgcct 
agcgcgcggc 
gcgg t gg cgc 
acttatgctt 
atcgcggctc 
ccgtgcatcc 
tgtagggtgt 
gcacgcgctt 
atcatagcag 
ctaatcttat 
acgcgataga 
gggcagatta 
atattcttta 
ttccattgtc 
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12901 agaattccag tgtggtaccc aactttttgt gctcctattc gcaacctggg tacaaggacg 
12961 ttttgtggta tgctaacatc cggtccaatt tctagaccaa ctttttcact tttgcagtca 
13021 ttttttacat cgcacaacac ttgaccaaga ttatccgtac cgccgttttc ttcaacagct 
13081 ctgtccgcct cgcctattgt attttctgca aatccatcat attgagccca gtctgaaatg 
13141 ttgtcgtctt cttattcatt ttcttctatt gcaaccccta actctccgtg caaggtctaa 
13201 caattatagc caggtatgaa tcccgattcc aacaagtgga tatgaagact tcttagatgt 
13261 agaatactcc ttctggtttt tgcacttttg catggacaac atataaaacc cttaggcttg 
13321 tgggcattgg ccacactcaa aaaataatgc acgtcatcaa taaactcctt cgaccgtcgt 
13381 tctgcagtgt acatccattg ccgattcatc tacattagga gataataatt gtaaaggaag 
13441 tccacaaaag tgaaactact taaataatca tataataatt taaaatatca taattaaatt 
13501 gaaaactgac ggttttaatg tattcttctt atttctaacg attttaacga gttttaaatg 
13561 gacttaatcg gagtcatgat taactattta taaattttat ctgtctcaat aaattgtaaa 
13621 tatatttctc catgtattat ccttgtttag ttatttttaa aagttctaaa catattttta 
13681 atgcattata cttatttcta actattttaa agattttcaa atggacttag ttttctattt 
13741 ttattcttca ttttgtatat ttgccctttg ttgtctcttt ttaacaattt tatacaaata 
13801 cttataatct tattaagtac cctcattttc cctaaacaat ttttctctct catcgtattt 
13861 ccatatatct ttttgagata ataatggata taaacatagc tagaaatgta aatgttcacc 
13921 ttgcatcaat aggggatgaa gttgctaacc ttttagatct cctcgatttg tataatataa 
13981 ccaaaatatt ttcaccaaaa atttcgttaa acatccgaga tatttgttgt ttttgccgat 
14041 cgagcaaaga ttagtagtcc agcagtgtct gcaccaccac catcgtgata atgcatcttg 
14101 tgtgttactc ttgatgagaa aatacgtagt gaaaaccaca tatgtggtgg aaacttggaa 
14161 actaccgcta gatcgagaaa tggatgtcca agattcgtcc acatcaccaa gagataaaat 
14221 ttaactcgca gattcactta tgagttaaaa ttttaatgag agttaaattt taactcatgt 
14281 tgatgtggac gaatatcgga catccatttc tcgatccaac gatagcttcc aagtttccac 
14341 tacatatgtg gtttgcacta tatattttcc cattcttgat tatgtgtttg agagcagcta 
14401 gcacaaagag aaaaaaaagc atcgtttttc acgcgtatgt tttcagaact gttagatggt 
14461 gtgttttttg aaaaaacttt ctatagaaaa gtttctttaa aaaatatatt aatctatttt 
14521 ttaagtttaa aataattact acttaattaa ttatacacta acagcttatt tcgttctacg 
14581 tatcttgtca attttcgctc atcctttctt ctcaaacacg gcattggatg ctctcatagc 
14641 acttgctcgt tcggatagaa gacttgacga agacgaccgc tacaacttgg tgtgttatat 
14701 cgtgctttgt ttagcataat cattacatat attccatgcc gaagtgccga cgaggagacc 
14761 gtgttcgatg catctttgta tggcatctag ggacaaagag catagagtcc ctaccatagt 
14821 acctgctcgc gcagaagact tgacgagaag accgactgct acaccttggt gtgtaataat 
14881 atcgtgttgt gtgtaccatg catactcctt taaaacaaat aatggtggta acagtaaatc 
14941 tgtcatccca cccactctca ttgtaaattt tgcaagttat cacttgaact tcttaatact 
15001 ccatccgttt gcgtgtgttc tttcagaatt tgcgtgagca ctttttcttc tatataatct 
15061 gtctagtcca tgagctaaac caacatctct cgctgtcttg ccttgcactt ctgcacgatg 
15121 gtatcactcc cattattgct cttcgtcctg ttgttctctg cgctgctgct ctgcccttca 
15181 agcagtgacg acgatggtga tgctgccggc ggcgaactcg cgctgctctc tttcaagtca 
15241 tccctgccat accagggggg ccagtcgctg gcatcttgga acacgtccgg ccacagccaa 
15301 cactgcacat gggtgggtgt tgtgtgcggc cgccggcacc cgcacagggt ggtgaagctg 
15361 cggctgcgct cgtccaacct gaccgggatc atctcgccgt cgctgggcaa cctatccttc 
15421 ctcaggacgc tgcaactcag caacaaccac ctgtccggca agatacccca ggagctcagc 
15481 cgtctcagca ggctccagca actggtactg aatttcaaca gcctatcggg tgagattcca 
15541 gctgctttgg gcaatctaac cagtctctcg gttcttgtgc tgactaacaa tacactgtct 
15601 ggttctatcc cttcatccct gggcaagctc accggcctct ataatcttgc actggctgaa 
15661 aatatgctgt ctggttccat cccttcatct ttcggccaat tgcgcagatt atctttcctt 
15721 agcttagcct tcaaccactt aagtggagca atcccagatc ctatttggaa catctcctct 
15781 ctcaccatat ttgaggtcat atccaacaag ctaaatggta cactgcctac aaatgcattc 
15841 agtaatcttc ctagtctgaa ggaggtatac atgtattaca accagtttca tggtcatatc 
15901 ccggcatcga taggtaatgc ttccaacatc tcaatattta ccattggttt aaactccttt 
15961 agcggtgttg ttccactgga gattggaagg ctgagaaatc ttcagaggct agagcttgga 
16021 gaaactcttc tagaatctaa agaaccaaac gattggaaat tcatgatggc attgacgaat 
16081 tgctccaatc ttcaagaagt agaattggga ctttgtaaat ttggtggagt cattcctgat 
16141 tctgtttcca atctttcctc ttccctatta tatctctttt ttcgataaca taatttcagg 
16201 gagcttacct aaggatatcg gtaatctcgt taatttagaa actctttctc tcgctaacaa 
16261 ctccttgaca ggatcccttc cctcatcctt cagcaagctt aaaaatttac atcgtctcaa 
16321 actttttaac aacaaaataa gtggttctct cccattaacc attggtaatc ttacacaact 
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aactaatatg 

cctgaccaag 

tgaaatattt 

gggatcaaca 

gaacaaatta 

tttcctgcaa 

tctggacaca 

ggacatgcct 

aaccaatggt 

cggtggcata 

acatcaaatt 

gttactctac 

atccatgcga 

ttcgtccagc 

tagccaagat 

taaggcactc 

gatagttaca 

tgacttcatg 

agagcaaagg 

attggaccat 

caatgtgttg 

acttattgag 

aattggttac 

tcttttttgt 

aaaacagagt 

attctagtgt 

ttgagcctcc 

aggaagcttg 

agtatxagtg 

ccatcgagta 

ctctcgatgt 

attagtaccc 

cactgaaggg 

catgtatctg 

gtatttgtgt 

caaaaatttt 

gaaatcgacc 

taaattagcc 

gcttgttagg 

cctctcgaac 

agatgccgtt 

acggccaagc 

actaatgcgc 

gtcttcttcc 

cctttggtta 

ctacacagat 

tttgtgccac 

tcatcaacat 

catctccgac 

aattataaaa 

atacatagtt 

aaagtagact 

gattccatct 

gctgaacaca 

ttgtcaattc 



gagctccact 

ttgtttcaaa 

agcattcctg 

ccaaaagaaa 

tcgggtgaga 

aacaatttct 

cttgatctct 

ctgctccacc 

gtttttgcaa 

cctgaactac 

ctgctgttag 

atgcttctaa 

ggccacccaa 

catttgctgg 

tgtgaaagca 

aagagtttca 

atttgctcga 

cccaatggca 

cacttgactc 

crtcacttcc 

ttagatgctg 

ggaagctcat 

gcagcaccag 

gttttcttct 

atggtgtcgg 

tggaaacagt 

gtcagtacgt 

gtttggattc 

aatgccttgt 

gaatgcaagc 

catccggcat 

ttcacaactg 

cgagtgtgtg 

ttctattgtt 

gatttcaata 

gagatgtctg 

raatcagcta 

cagccctttt 

trtctctcttc 

caatcgattt 

acagcgtttt 

ccaagacact 

ttctcgcttg 

ttgtcgtcct 

ggaaacatgg 

ttattttgct 

cttaatctca 

ttgcttgaat 

acgcaaaacg 

tggattaata 

taacagtttg 

ttagaacaca 

ttggaggccc 

actaaacttc 

aatggtacc 



ttaatgcctt 

taaatcttgg 

cactctctga 

tagggaaact 

tccctagcac 

taaatggtag 

caggtaagaa 

cgctgaacca 

atgcttctga 

atcttccgac 

tggttgttat 

ccggccataa 

tgatcactta 

gctctggatc 

caagtcttgt 

tggccgaatg 

gcatcgataa 

gtctggaaga 

tgcatcagag 

atggccctga 

atatggtagc 

tgatgcaaca 

gttaagccta 

ctctagtgtt 

gaacactgcc 

aaccgggaag 

tgaaccgggt 

cgagaaatgg 

ttcactgctt 

cggagatgtc 

gtgaagatgt 

atttcattct 

actgcttggc 

gtatttctca 

ttgatgcata 

aagttaacaa 

atctaattgt 

tagcaaagga 

tctctcggtt 

cgccaccgtc 

atagacgcaa 

ccacggccca 

ccttggcata 

tgaggttctc 

gtggcagcaa 

aggtgaaata 

gtggcacata 

atctcgacat 

aagcaccatt 

tattttttaa 

aaaagcgtgt 

gcctaagtat 

tccttgtttc 

acaaaacttg 



cggtggtaca 
ccataataac 
aattttggat 
taaaaatatt 
cattggtgaa 
catcccaata 
tttgtcaggt 
ttcgttcaac 
aatttacatc 
gtgttcctta 
ctgtctcgtt 
gagaagaaag 
caagcagctg 
ctttggctct 
tgccgtgaag 
cgaaacactg 
cagagggaat 
ttggctacac 
agtgaccata 
acctattgta 
ccatgttgga 
gtcaacaagt 
aactgtttat 
ttatgactat 
tcgacacatg 
cggccgacag 
ctacatggta 
cttcaggctc 
agacttgggt 
atcaatgaac 
tggagtattt 
gccgtggtat 
tgtagctatt 
gaataaccac 
tatacccatg 
tcaatcagga 
acatgctgcc 
ttggagggtt 
ctcttgttag 
gccaccaggt 
ctcacgccta 
tttaggcccc 
atctggaact 
agtcgcagca 
cctttccatg 
ccttctcacc 
tatgtacaca 
gacttttata 
cgcgcatgat 
aagccacgct 
cgtacggaaa 
ggcagttcat 
ctgcagctca 
tagtgcttca 



ataccaggca 
tttataggtc 
gtgtctcata 
gtcgaattcc 
tgccaacttc 
gctctgactc 
cagataccta 
agcttccacg 
caaggcaatg 
aaatcaagaa 
tggacacttg 
aaagaagtcc 
gtaaaagcag 
gttttcaaag 
gtaccaaagc 
cgaaatactc 
gatttcaaag 
cctgaaacaa 
ctgcttgatg 
caccgtgata 
gactttggac 
tcgatggtaa 
gtctacctca 
gaaataattt 
gagatattta 
atagtacatt 
gactaatgga 
gagatatttc 
tgtcttgctc 
tgcgtgccat 
cgttgtaatg 
ttagttattt 
tcctgatctg 
acacctaagt 
ctatatgcta 
agcgattcac 
tttgcatgac 
aatgttctag 
attatggaac 
tccaaatcga 
gactttcttc 
ttgtatttag 
ccgccttcga 
cccagcaccc 
ttggaatatc 
ttggtaatta 
atagatatct 
aatctaaggg 
taatcaaata 
cctataatat 



cacttggaaa 
aaattcccat 
ataacttgga 
atgctgattc 
tgcagcatct 
agttgaaagg 
tgtccttagg 
gtgaagtgcc 
cccatatttg 
agaaaaagaa 
ccgtcttttc 
ctacaacgac 
cagatggttg 
gagaatttga 
tggaaactcc 
gacacgtcaa 
caattgtgta 
atgatcaagc 
ttgcatgtgc 
ttaaatcaag 
tcgcaagaat 
tcagggggac 
tttcatttct 
ttgctactgg 
cagttatgga 
cagaactgga 
tgttgttgac 
gccatgcagc 
tcaggaattg 
caaagagtcc 
tgatgtgtct 
acaagagagt 
cccatcagat 
acacaacact 
gaattatata 
accaaaccgc 
agtgcgatat 
agaaaaggat 
caattgattt 
tattccggcg 
tcggtacaga 
tgttgctttt 
tctgtagccc 
taaacagatg 
tcgaaatctc 
acatttcaaa 
cttggttagc 
tacgttcagt 
ttagctaaaa 
tttttaaaaa 



acgggagagg tgaagttggc 
tggcctcaag ttttcaacat 
cttccaggct cagcccagcc 
tgtttcaagg aatgatgact 
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SEQ ID NO: 7 

/translation="MISIiFLLLFVIiLFSALLLCPSSSDDDGDAAGDELALLSFKSSLL
YQGGQSLASWNTSCTGQHCTWGWCG PSLGNL

SFLRELDLGDNYLSGEIPPELSRLSRLQLLELSDNS IQGSI PAAIGACTKLTSLDLSH 
^ NQLR04 1 PRE IGAS LKHLSNL YL YKNGLSGE I PS ALGNLTS LQEFDLS FNRLS G AI PS 

SLGQLS SLLTMNLGQNNLSGM I PNS IWNI^SLRAFSVRENKLGGMIPTNAFKTLHLLE 
VIDMGTNRFHGKI PAS VANAS HLTVT QI YGNLFS G I ITSGFGRLRNLTELYLWRNLFQ 
TREQDDWGF ISDLTNCS KLQTLNLGENNLGGVLPNS FSNLSTSLS FLALELNKI TGS I 
PKDIGNLIGLQHLYLCNNNFRGSLPSSLGRLKNLGILLAYENNLSGSIPLAIGNLTEL 
NILLLGTNKFSGWIPYTLSNLTNLLSLGLSTNNLSGPIPSELFNIQTLSIMINVSK1W 
LEGSIPQEIGHLKNLVEFHAESiniLSGKIPNTLGDCQLLiRYLYLQNNLLSGSIPSAIjG 
QLKGLETLDLSSNNLSGQI PTSLADITMLHSLNLSFNSFVGEVPTIGAFAAASGISIQ 
GNAKLCGG I PDLHL PRCC P LLENRKHF PVL P I S VS LAAALA I LS S LYLLI TWHKRTKK 

GAPSRTSMKGHPLVS YSQLVKATDG FAPTNLLGSGS FGS VYKGKLN I QDHVAVKVLKL 

ENPKALKSFTAECEALRNMRHRNLVKIVTICSSIDNRGNDFKAIVYDFMPNGSLED 
HPETI^OADQRHI^IJIRRVTIL^ 

HVGDFGLARILVDGTSLIQQSTSSMGFIGTIGYAAPEYGVGLIASTHGDIYSYGILVL 






E I VTGKRPTDSTFRPDLGLRQYVEIX3LHGRVTDVVDTKL I LDS ENWLNSTNNS PCRR I 

TECIVWLLRLGLSCSQELPSSRTPTGDI IDELNAI KQNLSGIiFPVCEGGSLEF r 



SEQ ID NO: 8 



DEFINITION Oryza sativa receptor-like protein gene, family member E, 

complete cds. 
ACCESSION U72724
SOURCE rice.
ORGANISM Oryza sativa

Eukaryotae; mitochondrial eukaryotes; Viridiplantae; 

Charophyta/Embryophyta group; Embryophyta; Magnoliophyta; 

Liliopsida; Poales; Poaceae; Oryza. 

FEATURES Location/Qualifiers
source 1..9424

/organism="Oryza sativa"
/strain="IRBB21"
/chromosome="11"
CDS 2819..5260

/note="Xa21 gene family member E"
/codon_start=1

/product="receptor kinase-like protein" 
misc_feature 5211..8128

/note="'truncator', an insertion sequence with the
characteristics of a transposon"
misc_feature 5484..5665

/note="'Snap-012', transposon-like sequence"
intron 8357..8644



ORIGIN 

1 aagcttcact tttcccctat tttgttaaga tcagttaatg gtgtcaattc agccataaaa
61 agacatgtat cccttttaca tttgcccaca aaaaaatttg ctgggtcaca ttagatatac
121 ggacacacat ttgaagtatt aaacgtagac taatacaaag caaattatat aatccgcatg
181 taaactgcga gacgaattta ttaagcctaa ttaatccgtc attagcaaat gtttactgta
241 gcactacatt gttagatcat ggtgcaatta ggcttaaaag atttgtctcg taatttacac
301 acaaactgcg taattagttt tttcaatatt taatacttcg tacatgtatt taaacgtttg
361 atgtgaccgg acaaaaaaaa aattgccggt gggtctaaat ccaccacccc ccctcctctt
421 agagttgata tacattttca cacaacttgc acggctgcaa aacttgacta aaaagattca
481 tgacaaatct ctcgcaaata ttgagatagg aaaagagaga gagaaaatca ggtagcaacg
541 gcatctcgct ggtctttgtg aacgggaaac tattttaaca gctcgagtgg acgtcaaccc
601 gtttattgca tgtcatctaa atggttataa aaaaaattga aaaaatatga ataaggatag
661 atcaatatgt aatatatcaa tccacaaaca tgcaagttaa aatttaactt ctacaagttg
721 taataaaaat aagaaacaaa actcaaatta ctatatgtat atttacaatt aaatttgtta
781 tttttgttat aacctgtaga agttaaattt gaacttgcat gtttgtggag tgatatatta
841 catattgatc tatcttgtca atttttttga aaatttttcg taaccatcta gttgacatgc
901 aataaacgga tggacgtcca ctcgagtgct gctagggcgg atctatagta cgttgtcagg
961 cgacaccgat gactcccggt aaaaacctat agcaaaactg cttattcata tgttataaca
1021 ccatgatcta aacataatta gtatactatg ataccgttag acacgttatg acaccaatga
1081 atcgattttc cggatccgcc acagtgcacg ggattgggat gggaacggcc gatggacgct
1141 gtgtgttggg ctagctaggt ggggttggga gttgggacta ggtaggttat ttttgccagg
1201 ggtacaatag ttcttttgca ttttgtaact cttctttttc tttcgagata atccattata
1261 tgccattaac tttgtcgcac gtctacgatt tgccactgac tttgtcacgt tctacaatat
1321 gccatcgact ttttcttaac ttctacgatt taccatcgcc gtccggttag ccacttttag
1381 tactgtacaa atttgttgaa atgaccaaaa tacccctatg acaaaaatat ccaaaatttg
1441 gataaaatta tcaaaatatt atattataaa cataagattg taaacatcca aaatttgacc
1501 aaaaacttga aatatgatat ttcataattt ttatccaaat tttgaaacct tttctcccag
1561 gggtatttrg gtcattttga caaatttgta cagtactaac ggagactaac cggacggcga
1621 tggtaaattg tagaagttaa gcaaaagtcg atggcatatc gtagaacatg ataaagtcag
1681 tggcaaatca tagacgtgtg acaaagtcag tggtatataa tggtttctct cttttctttt
1741 tctctcagaa atttgggccc agctaatttt tgccaatgtt gttgcaagaa ctcaacaagt
1801 gtatcaacct ctcttctgta ttccatctgc cttctatggg tgattattta cacatttata
1861 tacagggata tacaagagga aaatgcttcg gtgcccctat ccttacgatg tgcctatgtg
1921 actggccggt gggccagcct tggtcctaat ggtaacttga tggaggacag gagagctgcc
1981 cccggcatgg gggcatcgct gttcccgagg gaccttgcta ctacccgggt aaaggcatcc
2041 ccgggtaata gcatgctccg cctagctgcc atcaatagtg tcagtgttag ttggggcgcg
2101 tagtgcgcat ttattcgcgc ccctagcgcc tggttcttga cttgcttgcg cgagcacatc
2161 gaggcataat tcactccggt tggcacatcg ggctttaggg gcccctcgga gctccgaagc
2221 ccttaggtag cctccggagc tccccttccg gactccgggt acctttcagg cttcggagct
2281 cggtcactag gtgtgtcacc ggagcccatg gcgctcgacc tcccggagct cttccggagc
2341 cggtcggccg ccatccgaag cctgcttagg tcttgttcct ctcgccctca atgagatccg
2401 attctctcgg atctccggag gcctccggag caggggggcc agccgcggac cgacatggcc
2461 ttgattcacg gtcatctccc ggggaggatg gttctccgga gtgctgcagc ccttcggagg
2521 ctagcgtccc ttgatccgga gatactcccc taacaccttc ttattcaacc aagctaggac
2581 caatgaccat gacccttttg gatatcaaga tgaccacagg tttagatatc ctcttaatca
2641 gccaaccgtt ttccagtcag aaaatcaagt gtgccaacaa gttgcggacc aagaatgttg
2701 gtggttggtc aggctacatc actttttctt atatctgtct aagtccatga gctaaaccaa
2761 aaacatctct cgctcttgct gtcttagctt gcaccgatat tctctgcatc tcggcacgat
2821 gatatcactc ccattactgc tcttcgtcct cttcttctct gcgctgctgc tcttcccttc
2881 gagcagtgac gacgacggtg gtggtgatgc tgccggcgac gaactcgcgc tgctctcttt
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10 






2941 caagtcaccc ctgctatacc aggggggcca gtcgctggca tcttggaaca cgtccggcca
3001 tggccagcac tgcacatggg tgggtgtcgt gtgcggccgc cggcacccac acagggtggt
3061 gaagctgcgg ctgcgctcct ccaacctggc cgggatcatc tcgccgtcgc tgggcaacct
3121 atccttcctc aggacgctgc aactcagcga caaccacctg tccggcaaga taccccagga
3181 gctcagccgc ctcagcaggc tccagcaact ggtactgaat ttcaacagcc tatcgggtga
3241 gattccagct gctttgggca atctaaccag tctctcggtt cttgagctga ctaacaatac
3301 actgtccgga gcaatccctt catctctggg caaactcaca ggtctcactg atcttgcact
3361 ggctgaaaat acgctgtctg gttccatccc atcatctttc ggccaattgc gcagattatc
3421 tttccttagc ttagccttta acaatttaag tggagcgatc ccagatccta tttggaacat
3481 ctcctctctc accatattcg aagtcatatc caacaagcta agtggtacac tgcctacaaa
3541 tgcattcagt aatcttccta gtctgcagga ggtatacatg tattacaacc agtttcatgg
3601 tcgtatcccg gcatcgatag gtaatgcttc caacatctca atatttacca ttggtttaaa
3661 ctcttttagc ggcgttgttc caccggagat tggaaggatg agaaatcttc agagactaga
3721 gcttccagaa actctttcgg aagctgaaga aacaaatgat tggaaattca tgacggcatt
3781 gacaaattgc tccaatcttc aagaagtgga actgggaggt tgtaaatttg gtggagtcct
3841 ccctgattct gtttccaatc tttcctcttc gcttgtatct ctctccatta gagataacaa
3901 aatttcaggg agcttaccta gagatatcgg taatctcgtt aatttacaat atctttctct
3961 cgctaacaac tccttgacag gatcccttcc ctcttccttc agcaagctta aaaatttacg
4021 tcgtctcact gtagataaca acaagttaat tggttctctc ccattgacca tcggtaatct
4081 tacacaacta actaatatgg aggtccaatt taatgccttc ggtggtacaa taccaagcac
4141 acttggaaac ctgaccaagc tgtttcaaat aaatcttggc cacaataact ttatagggca
4201 aattcccatt gaaatattta gcattcccgc actctctgaa attttggatg tgtcccataa
4261 taacttggag ggatcaatac caaaagaaat agggaaactt aaaaatattg tcgaattcca
4321 tgctgattcg aacaaattat cgggtgagaa ccctagcacc attggtgaat gccaacttct
4381 gcagcatctt ttcctgcaaa acaatttctt aaatggtagc atcccaatag ctctgactca
4441 gttgaaaggt ctggacacac ttgatctctc aggtaacaat ttgtcaggtc agatacctat
4501 gtccttaggg gacatgcctc ttctccactc gctgaacctt tcgttcaaca gcttccacgg
4561 tgaagtgcca accaatggtg tttttgcaaa tgcttctgaa atttacatcc aaggcaatgc
4621 ccatatttgc ggtggcatac ctgaactaca tcttccgacg tgttccttaa aatcaagaaa
4681 gaaaaagaaa catcaaattc tgctgttagt ggttgttatc tgtctcgttt cgacacttgc
4741 cgtcttttcg ttactctaca tgcttctaac ctgtcataag agaagaaaga aagaagtccc
4801 tgcaacgaca tccatgcaag gccacccaat gatcacttac aagcagctgg taaaagcaac
4861 ggatggtttt tcgtccagcc atctgttggg ttctggatct tttggctctg tttacaaagg
4921 agaatttgat agtcaagatg gtgaaatcac aagtcttgtt gccgcgaggg tactaaagct
4981 ggaaactcca aaggcactca agagtttcac ggccgaatgc gaaacactgc gaaatactcg
5041 acaccggaat cttgtcaaga tagttacgat ttgctcgagc atcgataaca gagggaatga
5101 tttcaaagca attgtgtatg acttcatgcc caatggcagt ctggaagatt ggctacaccc
5161 tgaaacaaat gatcaagcag agcaaaggca cttgactctg catcagagag tgtcacgccg
5221 gaatttctat ccaaaattcc aaacgcttac atgtgtgtga accctcgtcc aggaatcagc
5281 cgagacacac aataacaaat tgataataga gtacaattat tactctaatt aataagcgta
5341 taaaatgtca ttacagaggt agatagttcc tctcaatcaa taaagatcta agcagcggaa
5401 aaataagata aacggcgcag acggctccac tccacaggca gcttgaccaa ggctacacct
5461 aatcctccac accatcagct tcactgtaga actcttcctc tgatgaatga ttgcaaggtg
5521 agtatatgac atactcagca agccacgcag caaatatgca agtgcacagg ataacaaagg
5581 atggcatagt agggttttat ttgcaaaagc agcatttagc aaacatttga gaatttaata
5641 aaacagttaa gtaattaaac aatattaatc caacgctata caacataccc tgttgtatag
5701 gcccaaccat tctgaacaac catcccggct gcacagttct atctccaaac caggaatata
5761 ccattccaaa ccaggagcta atcaaattat taccaattaa agcacctttt attatgatga
5821 gaagggtgag actaatcacg aaagatattg ttagacccgc ctataaccgc gggcacggct
5881 attcgaatag ttttactctg atcagaggtg taccactgta cccacaagac acaaccccac
5941 atcatgtcac catgtgcctc aataccacca cggtacctcg gaaaggagtt gtgacaatac
6001 cccttgcata acacaatcca ctgcagtgca ccttcctgga tcataatcac ccccttaaaa
6061 acaaggcatg gactccccag cgaccctcgt gggcttatct ccgccacttc tcagtctggt
6121 gccccgcaat gaaccatgct atataaaaga taaagccgtt gcccatgctg gcttatggtt
6181 ggcacggtta atgtttcaca accgaaactc gtgaaccggt ccttaattgt catgagcacg
6241 accatcaaaa ccatgtgctc acaacccacc attatcaggt tttagttggc aaataattaa
6301 ttaacaaatc acgattgacc atcgtgaact atcattaagc catcattaaa taacagtgag
6361 tcataagtta tcccaatagt aagctaatgt ttctaagcag ggctaagcaa ttatatctaa
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6421 tatctagttg aaccaatata taaagctcac tagtcaaatt ataataaccc aaggtatcaa 
6481 ggaataaagt aatcaataac aaaagggcta taacaaacaa taggttaatt ccacccaatg 
6541 acattcgaaa ataaatgcaa tatttgaata gaaacaatag ctttaaatag gatcaacatg 
6601 ctcaaagggt tgtatgggat ctgtgtgact tgccttgctg gccttggaac tcttcaaact 
6661 cttctccggc gaaaacggac tctccggaaa cgacggaatc taaacaaaaa gaagcaaaac 

6721 caccaaaaca gcacataaac caactaaatc ggagctaaga tgaattagtt atgaattttt 
6781 gaagattaaa tcggattaaa acacttaaat tgattttaat tgaattatga cgcaataatg 
6841 aattattttt gaaaaggaaa aggaggatta ttgcgtcagc gggctagggt ttcggtggac 
6901 cgggcacaca ggcgacggct cacgcgaacg gacggccgag atcgacttga tccaaaacgg 
6961 acggccgaga tcgaacggtc cacgaccggc tcacagcgaa cggccccgat gacgtcggcg 

7021 atgacgtcac caccggcggc ggcggctcgg cggctcgggc ttgcacgctc gccggcgaac 
7081 gacggcacgg cggcgcgaat ggaaggcacc aacgggtaga gcgcgacgcg gcgaactcac 
7141 cggtgaccaa aagagcggcg gaagatcaat ggacggcgac ggcgacgagg aggaagcggc 
7201 ggcaaacttc gggtcgacgg tggcgacggt gctccggcgg tcttcggcgg cggcaaagga 
7261 gcggacgaga acggcggcga cttggcgatc acgacggtgg ccttcccgag cgatgatgac 

7321 gaccgagacg gcggcgacgc acggctggag cgacggctac gacggcggcg ctaggttgca 
7381 cggcgctaga gctcttccgg cgacgagagg cgaaggcgaa ggtggcgacg ggtagaggag 
7441 acaccgggga accttttaaa ggggctcgca ggcgacggcg aaggcccacg gcggctggcg 
7501 acgagaagga aggtttaggg ttcggaggag ggagacgaat ccgattcgaa ctcgattcca 
7561 acgatttcca aaacgaatta gccgatgttt ccaaaagaga aaaggtagag gagatcccgg 
7621 agattgttcc ccctctatca attcggccgg aaacggaaag gatcgatcga atttggaagg 
7681 gaacggcggc ggcgcgaaac tagggtttcg ggcggcggcg gccggaggtt gacgacgacc 
7741 ctgacaggcc ggccccacct gtcagcgggc ggacgcgcgc gcgcggcggc ggactgggcc 
7801 ggactgggcc gaggagagag agagcggttt tgggccgact ttcggcccaa agccaaaaga 
7861 gactttttaa aacctttttc aatttaaatt attcatgaaa tgtaattcca tttattaaaa 

7921 atacttcctt agctcaaata aatcccagaa aaatctagga attatagaat taagcaaagt 
7981 atttaatgaa attttatctg gccccatttt atattgtaat ttattaattt aaaattagat 
8041 cttctcttct aggcttttaa aataaattct aaaaattcca tttaaacaac aatttatata 
8101 ttttgaattt tcagggtgtg acagagagtg cccatactac gtgatgttgc atgtgcattg 
8161 gaccatcttc acttccatgg ccctgaccct attgtacact gtgatattaa atcaagcaat 

8221 gtgttgttag atgctgatat ggtagcccat gttggagact ttggacttgc aagaatactt 
8281 attgagggaa gctcattgat gcaacagtca acaagttcga tgggaatcag ggggacaatt 
8341 ggttacgcag caccaggtta atcctaaact gtttatgtct acctcctttc attgtttttt 
8401 ttagatttgc tctggtccaa caaaaaatac ctaaagatac agatacttgt acctcacagt 
8461 actaaatagt tttcgatcat tgcattgtta gatccaacga tcaggaaacg atttggtacc 

8521 gtgaccgtga ggtatcggaa tctcgagata ttttttgttc gaccgtagca aatctatttt 
8581 tttgtttgtr ttcttctctt taatgtttta tgactatgaa ataattttta tttctggaaa 
8641 acagagtatg gtgtcgggaa cactgcctcg acacatggag atatttacag ttatggaatt 
8701 ctagtgttgg aaacagtaac cgggctgcgg ccggcagata gtacattcag acctggcttg 
8761 agcctccgtc agtacgttga accgggtcta catggtagac tgatggatgt tgttgacagg 

8821 cagcttggtt tggcttccga gacatggctt caggctcgag atgtttcgcc atgcagcagt 
8881 attactgact gccttgtttc actgcttaga cttgggctgt cttgctctca ggaattgcca 
8941 tcgagtagaa cgcaagccgg agatgtcatc aatgaactgc gtgccatcac agcgtctctc 
9001 tcgatgtcat ccgacatgtg aagatgtgag acatgctgat gttatgtccg agtatttcgt 
9061 tgtaatgtaa tgtgaagggt gagtgtgtga ctgcttggtt gtaagctatt tcctgatctg 

9121 cccatcagat catgtatctg ttctattgtt gtatttctca gaacaactac acaccctaag 
9181 taggagtaca caatagtgta tttgtgtgat ttcaatattg atgcataccc atgctatgtg 
9241 ctaaaattat atactgaaat tttgagatgt ctgaagttaa cagtcaatcg gggagcgatt 
9301 cacaccatac cgcgaaatcg acctaatcag ctaatctaat tctacaggct gcctttgcat 
9361 gacagtgtga tattaaatta gcccagccct ttttagcaaa cgatgggagg gtcaatgctc 

9421 taga 

SEQ ID NO: 9 

/translation="MISLPLLLFVLFFSALLLFPSSSDDDGGGDAAGDELALLSFKSS 

LL YQGGQS LAS WNTSGHGQHCTWG WCGRRHPHRVVKLRLRSSNIjAG USPS LGNLS 
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FLRTLQLSDNHLSGKI PQELSRLSRLQQLVLNFNSLSGEI PAALGNLTS LS VLELTNN 
TLSGAIPSSLGKLTGLTDLALAENTLSGSIPSSFGQLRRLSFLSLAFNNLSGAIPDPI
iraiSSLTIF^ISNKXSGTLPTNAFSNLPSI^EVYMYYNQFHGRIPASIGNASNISIF
TIGmSFSGWPPEIGRMimiQIU^LPETLSEAEET^WKFMTALTO
CKFGGVLPDSVSNLSSSLVSLSIRDNKISGSLPRDIGNLVNLQYLSLANNSLTGSLPS
SFSKLKNLRRLTVDNNKLIGSLPLTIGNLTQLTNMEVQF7TAFGGTIPSTLGNLTKLFQ
INLGHNNFIGQIPIEIFSIPALSEILDVSHNNLEGSIPKEIGKLKNIVEFHADSNKLS
GENPSTIGECQLLQHLFLQNNFLNGSIPIALTQLKGLDTLDLSGNNLSGQIPMSLGDM
PLLHSLNI^FNSFHGEVPTNGVFANASEIYIQGNAHIC^
HQILLLVWICLVSTLAVFSLLYMLLTCHKRRKKEVPATTSMQGHPMIT^
GFSSSHLIXSSGSFGSVYKGEFDSQDGEITSLVAVRVLKLETPKALKSFTAECETIJ^
RHRNLVKIVTICSSIDNRGNDFKAIVYDFMPNGSLEDWLHPETNDQAEQRHLTLHQRV

SRRNFYPKFQTLTCV M 
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35 



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50 
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DEFINITION 

            Oryza longistaminata receptor kinase-like protein, family member
            A2, pseudogene sequence.
ACCESSION   U72727
SOURCE      long-staminate rice.
  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
FEATURES             Location/Qualifiers
     source          1..5940
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     CDS             2151..5855
                     /note="family member A2; receptor kinase-like protein
                     truncated by two mutations"
                     /codon_start=1
                     /pseudo
     mutation        2501..2503
                     /note="mutation found when compared to family member
                     A1, GenBank Accession Number U72725"
                     /citation=[1]
                     /replace="acc"
     mutation        4355..4356
                     /note="mutation found when compared to family member
                     A1, GenBank Accession Number U72725"
                     /citation=[1]
                     /replace="ac"
     3' flanking     5249..5940
     misc_feature    5453..5697
                     /note="Tourist-012, transposon-like element"
BASE COUNT     1570 a   1200 c   1188 g   1982 t
ORIGIN
1 tctagaaatt gaattttaaa attgaatata atagagaaat acaactgaac ataaatattt
61 tgtcattttc attccattta catgttttta aatctaaatt tacagtgcat tttgtatgta
121 attctaatac aaatttactg taatttgtat ttttagtcat tttttcagtt tctaaaagta
181 cagtgcactt cagtgacctc cttagtaagc tatgggggga aaatggatat gcacatctga
241 tctatctatg aaacatagaa gaaatattca tcaccatgta ctacttttta tgaatgtttg
301 ttggccagat aggatctgaa agatttctag tacagggttg tttcatgtat tgtttcagat
361 aggatctgaa ataaccactg tacagggttg tttcttgtat ggtttccgca tatttaaact
421 tttggttgca gatgtatggc taatcaatta tgcagtttta tggcaaacaa aaagggtggc
481 aatagagtat atttcatcat gtggtatatg attgtttttt gttgattaga ccagaatagt
541 tgtgtgtatt tgtttttttg ttctgagaac tattgtctgt tctttatttt cattcagaca
601 atatgaatgc tgttttgtac agattctttt tgttgtgatg ttcagacaat atgaatcagt
661 ggctgtttgt ctgtttttca ttcaaaattt tcagagtgca caagtgtgtg ttgttaatcc
721 agaaattttc agaggattgc ttgttctgtt ctgacctgaa ctagtaaaaa aacaaattca
781 gatttggctc aatattcaga ttcttgtttt tgtaatcatt gctgtgcaga atctgtttgt
841 actcctggtc atttagattt ttagtctgtg tgtttgttgt tcactgagta tttccattgg
901 gggcgagagt ctcttttctt ctatgctttt atacagaggt tccaaatcaa ttgcagtgtt
961 gtgcaatatt cagtgttgtt cagatctatg tatatagctt ctgttctatt ataatgcctg
1021 cagtaatctg ctcatgtaaa taacagaaac cctgtgggta ttttgcatat ttaggtacaa
1081 aattcaaagt ttgcagaaac aaatctgttg agcattatgt cttctaaacc actaaacagt
1141 tttttttctg tgctgatgtt ttttttctgc cctttatgtg aatgtttact tctctctaat
1201 ctctgtttgg ttttaattca tcagaggagc tattttgcaa gaataaatgt caaagaacaa
1261 catgaagaaa aggaagactc aaaaatatgc tgcgggtcac aggtaccttt tgatttgtga
1321 ttttcttgaa ttttgatccc caatatttct tgttttgagt ctgatctgga agaatcttgc
1381 acttttgatc cttctccaat tctgagtttc ttttccttca tctccatctt tttccccagt
1441 tttctttctt ctcttcgatc gatctgagtt tacctcttct tctctgtgtt tgtctggaag
1501 aattattttg cctgttttga ttactcgtca cgtgaatatg tcagcgatct tcttgcattg
1561 caagttttgt tgtcacgtgg gtttgctccc ttctttttgg gcttcatttg ggcttttttt
1621 tgagctcata atctgatttt ttttttcttt tcttgtttac taattgagcc taagttgggc
1681 cctgaaagat tcgtttttta gttctgttgg tgtcaaaaat ctgggctgtt ggatcaaaat
1741 ctgttggctg aaaaattcga cagatttgga actctatttt ctgttgactg aaagattagt
1801 tttttagttc tgctggagcc aaagtctggg ctgttggatc aaaatctgtt ggctgaaaaa
1861 ttcgacagat ttggaactct attttctgtt gactgaaaga ttagtttttt agttctgctg
1921 gagtcaaagt ctgggctgtt ggatcaaaat ctgttggctg aaaaattcaa cagattagga
1981 actctatttt ctgtcgactg aacagtagac agagttaagc agaaacgaat atcacaattg
2041 ctatgttcat tgtcttgcgt gagcgctttt tcttctatct gtctgtctag tgcatgagct
2101 aaaccaaaca tctctcgctc ttgcaccaat attctctgca tctctgcaca atgatatcac
2161 tcccgttatt gctcttcgtc ctgttgttct ctgcgctgct gctctgccct tcgagcagcg
2221 atgatggtga tgctgccggc gacgaacttg cgctgctctc tttcaagtca tccctgcgat
2281 accagggggg cttgtcgctg gcatcttgga acacgtccgg ccacggccag cagcactgca
2341 catgggtggg tgttgtgtgc ggccgccggc acccacacag ggtggtcgag ctgcggctga
2401 actcgtccga cctgtccggg atcatctcgc cgtcgctggg caacctgtcc ttcctcagga
2461 cgctggacct cagcgacaac cacctgtccg gcaagatacc ctaggaactc agcagtctca
2521 gcaggctcca acaactggta ctgaatttca acagcctatc gggtgagatt ccagctgctt
2581 tgggcaatct aaccagtctc tcggttcttg agctgactaa caatacactg tctggagcaa
2641 tcccttcatc tctgggcaaa ctcaccagcc tcactgatct tgcactggct gaaaatatgc
2701 tgtctagttc catcccttca tctttcggcc aattgcgcag attatctttc cttagcttag
2761 cctttaacaa tttaagtgga gcgatcccag atcctatttg gaacatctcc tctctcacca
2821 tattcgaagt catatccaac aagctaagtg gtacactgcc tacaaatgca ttcagtaatc
2881 ttcctagtct gcaggaggta tacatgtatt acaaccagtt tcatggtcgt atcccggcat
2941 cgataggtaa tgcttccaac atctcaatat ttaccattgg ttttaactct tttagcggtg
3001 ttgttccacc ggagattgga agcatgagaa atcttcagag actagagctt ccagaaactc
3061 ttttggaagc taaagaaaca aatgattgga aattcatgac ggcattgaca aattgctcca
3121 atctacaaga agtggaactg ggaggttgta aatttggtgg agtcctccct gattctgttt
3181 ccaatctttc ctcttcgctt gtatctctct ccattagaga taacaaaatt tcagggagct
3241 tacctagaga tatcggtaat ctcgttaatt tacaatatct ttctctcgct aataactcct
3301 tgacaggacc ccttccctct tccttcagca agcttaaaaa tttacgtcgc ctcactgtag
3361 ataacaacaa gttaattggt tctctcccat tgactatcgg taatcttaca caactaacta
3421 atatggaggt ccaatttaat gccttcggtg gtacaatacc aagcacactt ggaaacctga
3481 ccaagctgtt tcaaataaat cttggccaca ataactttat agggcaaatt cccattgaaa
3541 tatttagcat tcccgcactc tctgaaattt tggatgtgtc ccataataac ttggagggat
3601 caataccaaa agaaataggg aaacttaaaa atattgtcga attccatgct gattcgaaaa
3661 aattatcggg tgagatccct agcaccattg gtgaatgcca acttctgcag catcttttcc
3721 tgcaaaacaa tttcttaaat ggtagcatcc caatagctct gactcagttg aaaggtctgg
3781 acacacttga tctctcaggt aacaatttgt caggtcagat acctatgtcc ttaggggaca
3841 tgcctctgct ccactcgctg aacctttcgt tcaacagctt ccacggtgaa gtgccaacca
3901 atggtgtttt tgcaaatgct tctgaaattt acatccaagg caatgccctt atttgcggtg
3961 gcatacctga actacatctt ccgacgtgtt ccttaaaatc aagaaagaaa aagaaacatc
4021 aaattctgct gttagtggtt gttatctgtc tcgtttcgac acttgccgta ttttcgctac
4081 tctacatgct tctaacctgt cataagagaa taaagaaaga agtccctaca acgacatcca
4141 tgcaaggcca cccaatgatc acttataagc agctggtaaa agcaacagat ggtttttcgt
4201 caaccaattt ggtgggctct ggatcgtttg gctctgttta cagaggagaa tttgatagcc
4261 aagatggtga aagcccaaga cttgtcgccg tgaaggtact aaagctggaa actccaaagg
4321 cactcaagag tctcacggcc gaatgcgaaa cactgtgaaa cactcgacac cgcaatcttg
4381 tcaagatagt racaatttgc tcgagcatcg ataacagagg gaatgatttc aaagcaattg
4441 tgtatgactt catgcccaat ggcaatctgg aagattggct acaccctgaa acaaatgatc
4501 aagcagagca aaggcacttg actctgcatc agagagtgac catactactt gatgttgcct
4561 gtgcattgca ctatcttcac cgccatggcc ctgaacctgt tgtacactgc gatattaaat
4621 caagcaatgt gctgttagat gctgatatgg tagcccatgt tggagacttt ggacttgcaa
4681 gagtacttat tgagggaagc tcattgatgc aacagtcaac aagttcgatg gggataaggg
4741 gaacaattgg ttacgcagca ccaggttaag tctaaactgt ttatgtctac ttcctataat
4801 cttctctttt ttgaggtttc ttctctctag tgttttatga ctatgaaata ttttttgcta
4861 ctggaaaaca aagtatggtg tcgggaacac tgcctcgaca cctggagata tttacagtta
4921 tggaattcta gtgttgaaaa cagtaaccgg gaagcggccg acagatagta cattcagaac
4981 tggattgagc ctccgtcagt acgttgaacc gggtctacat ggtagactaa tggatgttgt
5041 tgacaggaag cttggtttgg attccgagaa atggcttcag gctcgagatg tttcgccatg
5101 cagcagtatt agtgaatgcc ttgtttcact gcttagactt gggttgtctt gctctcagga
5161 attgccatcg agtagaatgc aagccggaga tgtcatcaat gaactgcgtg ccatcaaaga
5221 gtccctctcg atgtcatccg gcatgtgaag atgttggagt atttcattgt aatgtgatgt
5281 gtctatcagt acccttcaca actgatttca ttctgccgtg gtatttagtt atttacaaga
5341 gagtcactga agggtgagtg tgtgactgct tggttgtagc tatttcctga tctgcccatc
5401 agatcatgca tctgttctat tgttgtattt ctcataataa ccacacacct aagggagggt
5461 tcggcagagg agattgtgag ttagtttgtt ttgttttcca cgcgcacgct tcccgaacta
5521 ctaaacggtg tgttttttgc aaaaaaattt ctatatgaaa gttgctttaa aaaatcatat
5581 taatccattt ttgaagttta aaatagttta tactcaatta atcatgtact aatggctcac
5641 ctcgttttgt gtatcttccc aatcttctct tttcccctcc tctcaaactc accctaagta
5701 cacaacactg tatttgtgtg atttcaatat tgatgcatat atacccatgc tatatgctag
5761 aattatatac aaaaattttg agatgtctga agttaacaat caatcaggga gcgattcaca
5821 ccaaaccgcg aaatcgacct aatgagctaa tctaattgta caggctgcct ttgcatgaca
5881 gtgcgatatt aaataagccc agcccttttt agcaaaggat gggagggtca atgttctaga
//



SEQ ID NO: 11

  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
FEATURES             Location/Qualifiers
     source          1..7204
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     CDS             join(1683..1758,1771..2186,2347..4352,5147..5547)
                     /gene="Xa21"
                     /note="disease resistance gene Xa21 family member F"
                     /codon_start=1
                     /exception="Author-given protein sequence differs from
                     conceptual translation for reasons explained in the
                     citation."
                     /pseudo
                     /product="receptor-like kinase protein"
     flanking_region 1..1682, 5548..7204
     misc_feature    233..384
                     /note="Gaigin-OH, transposon-like element"
     misc_feature    404..532
                     /note="Gaigin-012, transposon-like element"
     misc_feature    1080..1239
                     /note="Tourist-OH, transposon-like element"
     misc_feature    6201..6583
                     /note="Crackle, transposon-like element"
     misc_feature    6750..6956
                     /note="Ds-rice3, transposon-like element"
     repeat_region   7053..>7204
                     /note="number of AT repeats has not been determined; 3'
                     flanking sequence found in GenBank Accession Number
                     U72729"
                     /rpt_family="microsatellite"
                     /rpt_type=tandem
                     /rpt_unit=7053..7054
BASE COUNT     2124 a   1455 c   1309 g   2316 t
ORIGIN

1 aagctttcta aattatttaa ctctaagtct gttattatcc ccaaatacat catcatcata 
i,i f a * aatatt " "tatrcacg acatccttaa gctagatgct tttggccatt ctcttatctt 
*ai llll^ttt t " ct:c - ccca attaagatga gagtgtcttc tagcaatttg ccagttttta 
: 81 eutgtettt gagtcotcac acattttcar gatgttacca ataaattacg aacaccgtgt 
,41 ttagttctaa agtttttctt caaacttaca acttttcaat cgcatcaaaa ctttctccL 
301 cacacacaaa ctttcaactt ttccatcaca tcgttccaat ttcaaccaaa cttccaattt 
It] a gg ltlr* C taaacaca 9 c cgaaaacaaa atttgtgtgt tatggccctg tttagattct 
;«J aa *" ttcca ttacatcaaa ctttcctaca tacacgaact ttcaactttt ccgtcacatc 
11, !!!!" -ttaaaactt ccaattttaa cgtggaacta aacacaacct atataacgaa 

601 aaCttaat 99 tgaaagtcac acctcaaagg aagggcgcgc ctctagtcaa 

601 gaacatcaat taaaaaggta cacaggttgt actagcttgt tcatgtttaa tcttgcgtct 
-,->, llll q * c * c * ^aatccatgc caaacaaaag tgcttctata gagataatca taaoaatatg 
781 llt l 9 T CC atatccaact gctcagaaga atctcgttcg gaggtgaagg ttaagatgtt 
ll\ ttll !!* cacataaaac aaagcgatct ttttcgcata attaattaag cattagataa 
loi i!!! 2 ! aaaaataaar ^ a atatgatt tttttagaaa aaaaatatat acactaagta 
9 6 } JIS! ?! caa 99 a 9gaa gaaacacaca cttccatata gagagataga aacatagcta 
961 taggtagtgt cactgagtat ttttcatcac gcatatgcat ataaaattag ggggtgttta 
loll " tC " tSgg tgtaaagttt tggoatgtta tatcgagtat tacgtagaat gtcgtattag 
gt ?* tcgggc a =taataaaa aaataattac agaatccgtt agtaaaccgc gagataaatt 
llol ! attaagc= ^ " ttaatc " tcattaacaa atgtttaccg tagcaccaca ttgtcaaatc 
IVil ta 99tttaaa agattcgtct cgcaaattag tcataatctg tgcaattagt 

1261 tatttttaga ctatatttaa gacttcgtac aggtgttcaa acgttcgatg tgacatggtg 
1321 caaaatttta gggtgtcatc tagacactcc cttaattaga aagttaggaa gaggcggtaa 
1381 agaacgcagc atgactgaaa ctttgaaaat ttgataaggt acaccaactg gagtatcttt 
Itol ta " ttcatt 9 aa 9 a ="t9 accagaagag cttgacccgt ttttcttgga gtagccagta 
1501 atgtttcatt cttttccttt tgctgggact tctttttatt ttttttgaca ggagccattt 






1561 gttgggactt gggatccctt tactgttata ggaccagtgc ttgaatccaa acactgcatt 
2 * at ' a *f " gctcattgta gcgcac«ct ccgcatgcat ggcgagatca ccaac^tcgg 
1681 tcatgatctc ttctttgctg ctgctgctgt tgatcggccc agcgagcagt g«cg«?g.£ 
lit] "^ctgctgc tgctgctcgt accagtacag gcggcgtcgc cggcgacgaa ctcgcgctgc 
1801 tctctttcaa gtcatccctg ctacaccagg ggggcttgtc gctggcatct tggaacacgt 
1861 ccggccacgg ccagcactgc acatgggrgg gtgttgtgtg cggccgccgc cgccgccggc 
llli ^ CaC ! Cag ggtggtgaa 9 ctgctgctgc gctcgtccaa cctgtccggg atcatctcgc 
1981 cgtcgctggg caacctgtcc ttcctcaggg agctggacct oagcgacaac tacctctcco 
llil lllllll aCC accggagctc agccgtctca gcaggcttca gctgcrggag ctgagcggta 
HV; tTl agggagcatc cccgcggcca ttggagcatg caccaagttg acatcgctag 

„" ! a9CCa caaccaact 9 agattggtgc cagcttgaaa catctctcga atttgtacct 
228? rt!!;!!" at ggttt 9 tca 9 gagagattcc atctgctttg ggcaatctca ctagccttca 
2281 gtattttgat: ttgagctgca acagattatc aggagctata ccttcatcgc tagggcagct 

2*01 ItlTaVclTa 11??*°** tgaatttg ^ acagaacaat ctaagtggga tgatccccaa 

74*1 "* ta * ctgg aacctttcgt ctctaagagc gtttagtgtc agcgaaaaca agctaggtgg 

2521 tlllTJtT aC ! aatgCat - aaaa -ct tcacctcctc gaggtgatag aLtggacac 

2S81 S cat ^ caaaa tccctgcctc agttgctaat gcttctcatc tgacacggct 

III] ^ Cagattgat Waacttgt tcagtggaat tatcacctcg gggtttggaa ggttaagaaa 

2701 "tJS Ctgtatctc ^ 9gagaaa ttt gcstcaaact agagaacaag aaga^gggg 

2701 gttcatttct gacctaacaa attgctccaa atracaaaca ttggacttgg gagaaaataa 

28 2 J act?o!^- a 9ttC ^= Cta a "cgtc-. t = caatctttcc acttcgctS gt^tcttgc 

tlllltl::: ":!!!!!!! Caggaa ^ a - "caaagga* a tt ggcaa t c tt.ttggctt 



llll lit? tatC " tgca «caacaat« cagagggtca cttccatcat cgttgggcag 

2941 gcttagaaac ttaggcantc tagtcgccta cgaaaacaac ttgagcggtt cgatcccatt 

306^ UTT^ aatCtta «5 aactraatat ct,actgctc ggcaccaaca alttcagtgg 

3?2? SL r^ tacacact « caaacctcac aaacttgttg tcattaggcc tttcaactaa 



3121 taaccttagt ggtccaatac ccagtgaatt astcaatatt caaacactat caataatgat 
3241 ^ t9 ^ atCa aaaaataact tggagggatc aataccacaa gaaatagggc atctcaaaaa 
33oi llltt tttcat 9 c ag aatcgaatag artatcaggt aaaatcccta acacgcttgg 

3301 tgattgccag ctctcacggt atctttatct gcaaaataat ttgttatctg gtagcatccc 
3361 atcagccttg ggtcagctga aaggtc^cga aacrcttgat ctctcaagca acaatttgtc 
3421 aggccagata cccacatcct tagcagatat tactatgctt cattccttga acctttcttt 
3481 caacagcttt gtgggggaag tgccaaccat tggt -g C? --. c gcagatgcat ccgggatctc 
„J! aatc «aggc aatgccaaac tctgtggtgg aatacctgat ctacatctgc ctcgatgttg 
3601 tccattacta gagaacagaa agcattttcc agct«acct atttctgttt ctctggtcgc 
3661 agcactggcc atcctctcat cactctactt gcttataacc tggaacaaga gaactaaaaa 
3721 gggagcccct tcaagaactt ccatgaaagg ccacccattg gtctcttatt cgcaottggt 
III] aaaa ! CaaCa gatg S tttc 3 cgccgaccaa tccgttgggt tctggatcat ttggctcagt 
lltl " a " aagga "gcttaata tccaagatca tgttgcagtg aaggtactaa agct tgaaaa 
3961 Ittt a ! 9Ca ctcaa 9 a 9" tcactgccga atgtgaagca ctacgaaata tgcgacatcg 
4o" aocaattlto "IE!*!?" caatttg " c 9agca ttgat aacagaggga acgatttcal 
4081 3 !! a ^ 9 9 tat ^ acttca tgcccaacgg cagtctggaa gattggatac accctgaaac 
Till aaa S a gcagaccaga gg«cttgaa tcrgcatcga agagtgacca tactacttga 

42^ tStS!? gcattggact ^tcttcaccg ccatggccct gaacctgttg tacactgtga 
\->l\ tg ^ aaatca agcaatgtgc tgttagattc tgatatggta gcgcatgttg gagattctgg 
till f:^ 903393 atactt ^^ atgggacctc attgatacaa cagtcaacal ^cLgatggl 
43fli "! tagaggg acaattggct atgcagcacc aggtcagcaa gtccttccag tattttgcat 
til] ctagtgctat atgaaatagt ttttacctct agtgaaactg atggagaata 

itoi a a a?a aa f a ! ttgaaCtaa "aaattgca caaaaataag attltttgcc ataLLttc 
456? SS! tatagctagt tcatagaggt acatatttrtrt tttatatagg aatctagagc 

4651 ^ tacacac tcaaatcaaa ttatgggtgt tttctgctct acactgcaat atgaaatgat 
Urn * a ^ cagaag 9 atcaaatttg agtaaatttg tcaattctac atttaagaaa cacttttttt 
till tlTatltl^ agttattaca ""ttutt tcaagaactt gcattgacca tgaaaagtac 
48oJ £!S*r aC " Ctaattcc cacatggagg tggtgaaaat aatatagata caaaaacgaa 
486? allJt 9 ttgtgtgata tactataatc acaatgaaca caaacaggat tcgtacaaaa 
492? ?^ aa ^ 99CC atcatagcaa ctgattgctt ggggtaactg tatagcacaa tcataccaaa 
498? Itslat ??! tatgtattt 9 taaattagat tcttaaagtt aaatatgaaa tttcattggt 
4981 atttatgttt ctttatataa taaaaattaa tccaaccttt acatctacca tttgtccagc
5041 catccttgtt atttgtgata tttaacacgc aattttacat aattatacat ccaagttctt
5101 tttatttaac actggaaatt tgaaatcgta tttcctactc aaacagagta tggcgtcggg
5161 cacattgcat caacacatgg agatatttac agctatggaa ttctagtgct ggaaatagta
5221 accgggaagc ggccaactga cagtacattc agacccgatt tgggcctccg tcagtacgtt
5281 gaactgggcc tacatggcag agtgacggat gttgttgaca cgaagctcat tttggattct
5341 gagaactggc tgaacagtac aaataattct ccatgtagaa gaatcactga atgcattgtt
5401 tcgctgctta gacttgggtt gtcttgctct caggatttgc cattgagtag aacgccaacc
5461 ggagatatca tcgacgaact gaatgccatc aaacagaatc tctccggatt gtttccagtg
5521 tgtgaaggtg cgagcctcga attctgatgt tatgtcttgt aatgttttat tgccactagt
5581 cttcagattg gaatgctctt ccgatcagac ttcttcagtg gtatctacca cacgatcact
5641 aaagtcatcg tggctatttc ctgatccagc atatctgatc atgcatgttc tgtgttttat
5701 acctgtattt tactctgaat tgccacacct caaccctgcc tctgtttgtt tggcatacaa
5761 aagatagtga tgagtatatt gtttcagggg cttcctagtt ggcgtgtgtg cttaccggca
5821 cgcacgcagc ccgagggtgg gtttcttttt ttttccattg ttattccgtt gcttttttcc
5881 accacggtag attttttttt tctggatttc cattttttcc gttgtttttc tctatcgctt
5941 at^ct99c9g atttttttcc gtggtttttt tttcaagacg agtatatcta atgtaactaa
6001 catgttactt ttagataacg atggttatta aoataagatt tttttctgga agatttttgt
6061 aagtaaatgg taaaaaatat ggaaataoaa acggaaatag ttttgctgtt ataccgatcg
6121 tttccatatu taccgtattc ttataoaaat taccgtttct tataatatgg taattaccgt
6181 atttctaaat atottcatat cgattttgcr atatattacg acaaattttc -cccaaaaat
6241 ttgatagatg taattatagr acaatcgtag tgtaattaca ctgtaactat agtgtaactt
6301 gtatgtaacr ttcaaaaatc tctctgtaat atgttatttt ggtaaaatag aggttgtggg
6361 aacaaatcct tacacatatc rgtgcgctgt gattttctitit cttcctcacc aaaacaaaac
6421 ttataataga tttaacaatt caaaattaco tgaaacttat acaagttaca ccgtagttac
6481 atgcaagtta cagtgtaatt acactacaat tgtactataa ttacatctct caaattttta
6541 ggagaaaatt tgtcgacaaa tatataggtg atcccgttga tatttatagg gtatgtctct
6601 acttgactca cagtttagag attgattgac tatttaatca aatccctaac ttgattgcac
6661 ggctaaaatg gagttgattt ctaatttata tagtatagat tgaatttatt cgtacatata
6721 acatacttat gtaaagttaa atatatattt tctatagttt aatgtttctg tatttgttac
6781 cggttttcga tctataccga ccatgtttcc ttcagtatta ttccgtttcc ggttittctga
6841 tatttctgat atcgttttcg rttccgagtt taccgttttc gatttcattt ccgagaaaaa
6901 tatgattatg gaaatggctg aggctgtttt ccaatcattt ccgaccgttt tcttccctac
6961 ccgtagcaat aatatataat attttatctc taatctttct ctctctcata tcaacgaata
6981 
7021 ttcgctaaga gactgctatt aacaaggctt ttatatatat atatgtacat atatatatat
7081 atatatatat atatatagac acacacacat acatacatac atacatacat atatatacat
7141 acatatacat atatatatat ctatacatac atatacatat atatatatgt atacatacat
7201 atac
//

SEQ ID NO: 12

DEFINITION  Oryza longistaminata receptor kinase-like protein, 3' flanking
            sequence of family member F.
ACCESSION   U72729
SOURCE      long-staminate rice.
  ORGANISM  Oryza longistaminata
            Eukaryotae; mitochondrial eukaryotes; Viridiplantae;
            Charophyta/Embryophyta group; Embryophyta; Magnoliophyta;
            Liliopsida; Poales; Poaceae; Oryza.
FEATURES             Location/Qualifiers
     source          1..1332
                     /organism="Oryza longistaminata"
                     /strain="IRBB21"
                     /chromosome="11"
                     /map="11q, RG103"
     gene            join(U72728:1683..7204,1..1332)
                     /gene="Xa21"
     misc_feature    1..1332
                     /gene="Xa21"
                     /note="sequence 3' of the microsatellite sequence present
                     in GenBank Accession Number U72728"
BASE COUNT     233 a   495 c   380 g   224 t
ORIGIN



1 gccgtcgatc aatcatttgg aaacggccca catcttttcc atctaratgc attcatgaaa 
61 tacatggtat atcccatcga tcggacatca cctgttagcg cgtacsccat cgtcgtgatc 
121 aacctagcta gggcaaacgt cctccgatcg ccaccatcac caatcaacaa gctgctgcgg 
181 cctctcggtg gcctgaggtt gctcaaccga gaagaacatc cgttccgatg cttctcctcc 

241 tccatcgatc tcgtcttccc aggtcgccgc cgccgccaca tggcaaccac cgtgacccac 
301 ccgccgccga cggaatcccg ctggttcgac ggcggcggcc gcgacrgetg acccggcctc 
361 ggtgatgctg gaacgttggg gctgcctcag gggctccacg ccggcgaacg tcgccgccga 
421 cgacaacacc gccgcggagt cccgcacctc ccgcggccaa cccctccgcg tcgccctcgc 
481 cc9cgcgtcg ccgccggcga tctccttcat ctgcrtcgat cgcgcogatg atggctacgt
541 catc9cgg" cacggcgact ctgtcctctt ccggatgagt tggaacgact acttcgtcta
601 catggccgcc gccggcaagc cgccgtcgct gacgctgctc cccgtcrgcg acatccccat
661 gaacgagcgc tgctgggtca gcaaggaccg tttcaacgac agcttccgca ccacgggccg
721 ggtgttcgac cagcaggaca ccggcatcct gcgcctccgc ggcggcgagg aogcgccgcc
781 tctagtggcg cagctccaga tcgcgcacga ggcgccgttc gacacsgccg agctctgcgt
841 9ctc=gcccc ggccacggcc acggcgagtg ggagcucaao acgoccgtgc ccatcgtcca
901 ccacgacggc ggcggcgaac cccgccatgg cctggagatg tggcacoaoa caacgtggcc
961 gtccccgtcg gcgaccgct- catgtgctgg gccaactacg acctccccac cttcctcatc
1021 tgcgacatgg cggcggcgoa tcccgacaac cccaagctcc tgtacsttcc gctgccggtg
1081 aaccagtgcc acccaaggga gagcgacttc gacgacgacc accaccacga cgagctgatt
1141 c"tggggag cacttccgca acatcgtcgc caccggcacc gacacccacg acgacattgc
1201 gcgattcgtc agcatccaca accgctgctg ctgcggcgcg cccgtcatac acaacctgtg 
1261 cgaacgctcc agctcggcgt tcatggtgaa catctggaag cttgccrgcg gaacccccgc 
1321 gcccgcgaca gc . 
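
The BASE COUNT line of a record can be cross-checked against its ORIGIN block. The following is a minimal sketch and is not part of the original listing: the function name `base_count` and the sample text are hypothetical, and letters other than a, c, g and t (ambiguity codes or scanning damage) are tallied separately rather than guessed.

```python
from collections import Counter


def base_count(origin_text):
    """Tally a/c/g/t in an ORIGIN block, ignoring row indices and spaces.

    Returns (counts, other), where `other` is the number of letters that
    are not a, c, g or t (for example ambiguity codes such as r or s).
    """
    letters = Counter(ch for ch in origin_text.lower() if ch.isalpha())
    counts = {base: letters.get(base, 0) for base in "acgt"}
    other = sum(n for ch, n in letters.items() if ch not in "acgt")
    return counts, other


# Hypothetical two-row sample, not taken from any record in this listing.
sample = "1 acgtacgtaa ccrt\n61 ggta"
counts, other = base_count(sample)
print(counts, "other:", other, "total:", sum(counts.values()) + other)
```

For a complete record, the four counts plus `other` should add up to the length given in the source feature (1332 bases for the record above).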

SEQ ID NO: 13 

TCAAG^CAATTTGTCAGGTCAAATCCCTGAATTTTTGGCAGGGTTTAGTTTTATATATCTTAACTTATCTTTCA 
ATAATTT7GAAGGCAGAGTGCCAACAGATGGGATATTCAAGAATGCAAGCATTGTTTCAGTCACAGGAAACTCTAA 
?^Z^3^^^^^^^^^^^^^*^"^^^^^^^^^^^^'^^^^^^^^~^^A^AAGGTCTGAAAAAAGGAGAGTTAAGGTA 
ATTGTTGGTATTATTGCAGGAGGTTTAGGAGCAATTTTGGTGGTGTTGTCCTTTATATTTCTTTTGAGATTAAGAA 

CTATAAGGCCACTGATGGGTTCTCCTCAGAAAATTTAAT7GGTACTGGTAGTTTTGGGTCCGTATATAAAGGAATT 

CTTGATGAAGGTGGACCAGTTGTTGCTGTTAAAGTGCTTAACCTCCAGCATCATGGAGCAGCTAAGTCTTTCATGG 

CiGAATGTGAAGCCTTGAGAAATATCAGACACCGGAATCTTGTAAAGATACTAACTGCTTGTTCAGGTGTTGATTA 

an ^ GG ^ TGAT ^ C ^ GGCACTGG " TTA " G AGTACATGGATAATGGAAACCTTGAGGAGTCGTTGCATCTACCA 

W GiTTCAGCAGATAGAAATCATGGGGAGCCTAAGAATCTAAATCTTCTTCAGAGAGTAAATATTGCAATTGATGTTG 

CTTCTGCAATTGAATATCTCCATCATCATTGCGGAAATCCAATAGTTCATTGTGACCTTAAATCAAGCAATGTGCT 
GTTA 

SEQ ID NO: 14 

translation (1st frame): 

™™" G °"f FLAGFSFIYLNLS ^ 

IVGIIAGGI.GAIt.WI.SFIFLLRLRKKSHKPSSSrSENSLLELPKVSYRDLYKATDGFSSENLIGTGSFGSVYKGI 
LDEGGPWAVKVLNLQHHGAAKSFMAECEALRNIRHRNLVKILTACSGVDYQGNDFKALVYEYMDNGNLEEWLHI.P 
VSADRNHGEPKNLNLLQRVNIAIDVASAIEYLHHHCGNPIVHCDLKSSNVLL 
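
The first-frame translation shown for SEQ ID NO: 14 can be reproduced from the nucleotide sequence of SEQ ID NO: 13. The following is a minimal sketch, assuming the standard genetic code; the helper name `translate` is hypothetical, and the example fragment is a cleanly printed stretch of SEQ ID NO: 13 (beginning at GAATTT), chosen because the first few bases of the printed sequence are damaged.

```python
BASES = "TCAG"
# Standard genetic code, codons ordered TTT, TTC, TTA, TTG, TCT, ...
AMINO = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
         "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {a + b + c: AMINO[16 * i + 4 * j + k]
               for i, a in enumerate(BASES)
               for j, b in enumerate(BASES)
               for k, c in enumerate(BASES)}


def translate(dna, frame=1):
    """Translate `dna` in forward reading frame 1, 2 or 3.

    Codons containing anything other than A, C, G or T become 'X',
    stop codons become '*', and a trailing partial codon is ignored.
    """
    seq = "".join(dna.split()).upper()
    start = frame - 1
    codons = (seq[i:i + 3] for i in range(start, len(seq) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


fragment = "GAATTTTTGGCAGGGTTTAGTTTTATATATCTTAACTTATCT"
print(translate(fragment))
```

The fragment translates to EFLAGFSFIYLNLS, which agrees with the residues FLAGFSFIYLNLS that remain legible in the printed translation.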

SEQ ID NO: 15 
DT4 

°°JI G S CGGCGCAG ^ 

J cc ! c SI CG fr TOCGG ^ TCGCCACGCTGACCCTGCTGGATGTGT 

CGGCGCTCGCTCAATGCAGGCAGCTCAGCCTCATCATCGTCCTGACCCACAGCCGCCTGTCAGGGCCGGTTCCGCG 
^ GC ^° GG f TCGCTGCCGCAGCTCGGCGAGCTGGCACTCTCCAAC ^ CG 

GCGGCTTGGTGTCCCTCAACGTACTGAATCTCGCACACAACCAGCTCTCAGGTCCGATTCCGACGACGGTCGCAAA 
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^iSi CC ^ GWCGC ^ TCGGGTCCGGCGGATCAGGCACGGTGTACAGG 

!S g J cgc I ga ; ctgggaagcgcggc t^ 

Xl*~~Z CCTGCTCTGAAGCCGCTG GCGCCGCGTGAGGAGTCGTCGATGACGGAGGTGCTG 

JSSS^SSiSIf GGAGTCGAGTGT AATCCAACACAAACAAGACCTTAGGATTTGAAAAGTGGGCTGGATTGTC 

Ic^f CTCA ^ 

SEQ ID NO: 16 

DM4 cDNA CLONE 

GCATTGCGGCGACTCGGACTTGCCGGGAACAArTT(-arT-fTa^^» » r^^<~/~>~»-»^--»^^^ 




^.^^^.I"„ " "~* iv '- lv -"" v * , -"« , - , -AA-rTGGTTGGCGGTTTGCCAGCGAGCTTCTCAGGATGCAGATCGCT 

«^SSSS TCTCGGGTCTAACATGC TGGAANGAGAGATCATGCCCGAGCTGTGTTCATCTTTGCCATCACT 
^ GA ^ GC ! G " CCTACCCAACAACTACA TCAATGGAACCGTGC TO 

G^riSS AGCTTCAACCTCATGG " GGT ^ 

^JJifl™ 0 AACATAACCGAATGATCCCGT TCNCNTCNCCAGTNGCTNAATCCATATGGTGTCCTTCCGGCAA 
SJJHT NNNNNNNNNNNNNNNGGA TCCTNTTTCNTNACAGNGGGGATTTTTATATGGTGTAATTGCGGC 

SIccgccgScS* A ^ G ttcgctgcagcacctta ^tggttgattttcaacagcaacaatttttccggtgcg 

AGCAGCAACAGGCAGGGNT< = A TCACTGGAGGCATGGTTTTCTGGGAAGCAGTTCGCGTTCCT 
S! GA ;i G ; G f C ! GGAACATCTGCCCAGGTGG TGGAGTGCTGTTCGA 

SSSS^JII GACCTNTCGCACAA TCACCTCACCGGTGTCATCCCTGCTGGACTGCGGTCTTTAAATTTCCTAGC 
^™^ ACG TCTCCAACAACAACCTCACTGG TGAGATACCCACGTCAGGGCAGCTCAGTACATTTCCAGCATCC 

CTCWU^CCCCTCTAACGTGCCGAGGAAGTTTCTCGAAGAGTTCGTGCTCCTTGCACTGTCGCTCACCGTGCTCAT 

a?ctcSgS A ^^ CCGC ^ TCGTC 

GAGAATCCGTTGAGGAAACTAAGA TATGCCCACCTGCATGAGGCTACCAATGGCnrCAGCTC 
JSUSSJI! TOGGCACAGGAGGATTCGGTGAGG TTTACAAGGCTAGGCTCATGGATGGCAGCGTTGTGCCTGTC 
GCATTTCACAGGG CAAGGCGACCGGGAGTTCACTGCAGAGATGGAGACCATTGGCAAGATCAAAC 

^AAGCCTGGATGTCTTGCTCCATGAAAGGGACAAGACTGATGTGGGTCTTGATTGGGCAACAAGGAACAAGATT 
GCAGTTGGCTCGGCAAGAGGACTGGCCTTCCTCCACCATAGTTGCATCCCACACATCATACACCGGGACATGAAGT 
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CAAGCAACGTGCTTCTTGACGATAATCTCGATGCCTACGTATCGGATTTCCCJVATCGCGCGGCTCGTGAATGCTGT 

TGACTCACATCTAACCGTGACCAAGCTCTTAGGAACACCTGGTTATCTGGCTCCCGAGTACTTCCAGTCGGTTATT 

TGCACAACTAAGCGCGACGTCTACAGCTATGGCXJTTGTTCTTCTGGAGCTTCTCTCAGGGAAAAAACCAATCAATC 

CGACTGAATTCGGCGACAATAATCTCATCGACTGGCCCAAGCAGATGGTTAACGAGGACCGGTGCACCGAGATATT 

TGATCCTATATTGACCGACACAAAATCCTGCGAGTCGGAGCTGTACCAGTATCTGGCGATTGCTTGCCAGTGCTTG 

GACGATCAACCTAGTOSCAGACCTACGATGATCCAGCTC^TGGCAATGTTCAGTGACTTTCAGATTCAC^CTGGCA 

GCTTCTTCTTGGACGGCTTCTCGCTCGATTCAGATAGAGGAATCATCTGAAAAAAAATGTGTAAATGTTATTGATC 

CCTGCAGATTATATGATTCACTGGATTTAGGTATTAGCTTAGCCATGTTTAACTCATGTTAACAGGATACAAACAG 

ATGTAAATTTGTTTCGGTTGCCGTACATAGTACACAACAGCTTCAACACAGATACCATATAGAGTTGTTTCCAAAA 
AAAAA 

SEQ ID NO: 17 
TRK1 

ATCGGGCAGGTCTTCAAAATACTTGTTACATCTTCTTCTTACCTTTGATA 

TTTTCCAAAGTATTTGTAACTTCAAATCACTAGTTATCTAAATGGCTACT 

TCTAACACAAGTCTCTTGTTTTTCGCGTATTTCCTCCTTGTGTTCCTTAT 

TACTCCATCTCAATCGCGTAACCTGTCTCTGAGACGACAGGCTAAAACTC 

TAGTTTCATTGAAATATGCATTTGTACAATCATCTGTTCCTAGTACTCTG 

TCCAATTGGAACATGTCGAATTATATGTCTATATGTTCTTGGACAGGTAT 

AACGTGTGATGATACCAAATCAGTAACTTCCATTGATATATCCAATCTAA 

ACATTTCTGGCTCTTTATCACCTGATATTCATGAGCTCACTAGACTTCGC 

GTCCTGAATATTTCTAACAATTTGTTTAGTGGAAACTTAAGCTGGGAGTA 

TCGCGAGTTTAATGTACTTCAAGTGTTGGATGCTTATAACAACAATTTCT 

CTGGTCCACTCCCTTTGGGAGTTACTCAACTTGTGCAGCTCAAGTACTTG 

AATTTCGGGGGTAACTACTTTTCAGGGAAGATTCCTTTGAGTTATGGTAG 

TTTTAATCAGCTTGAGTTCCTGTCTCTTGCTGGGAATGACTTGCACGGTC 

CTATACCGAGGGAGCTGGGGAACGTTACGAGCCTCAGGTGGTTACAGTTG 

GGTTATTATAATCAATTTGATGAGGGGATTCCACCAGAGTTGGGGAAACT 

TGTTAATTTGGTTCATCTAGATCTTTCAAGCTGTAACTTAACGGGTTCGA 

TTCCACCAGAATTGGGCAATCTTAATATGTTGGACACTCTTTTCTTGCAA 

AAGAATCAACTTACTGGTGTATTTCCTCCTCAGCTAGGGAATTTGACAAG 

GTTAAAATCTCTTGATATCTCGGTCAATGAACTCACAGGAGAGATCCCGG 

TTGACTTGTCAGGACTCAAGGAGCTCATATTGTTGAACCTCTTTATCAAC 

AATTTGCACGGTGAGATTCCAGGATGTATCGCGGAGCTGCCAAAGTTGGA 

AATGTTGAATCTTTGGAGGAATAATTTCACTGGCTCGATTCCTTCTAAGC 

TTGGGATGAACGGTAAACTAATTGAAATTGATCTGTCTAGTAATAGACTC 

ACTGGCTTGATACCAAAATCTCTATGCTTTGGGAGGAATTTGAAAATCTT 

GATTCTTCTTGATAATTTTCTGTTTGGACCTTTACCTGATGATTTTGGGC 

AGTGTCGAACGTTGTCCAGAGTCAGAATGGGACAGAATTACTTGAGTGGA 

TCAATACCAACAGGGTTTCTTTATTTGCCTGAGTTGTCACTGGTGGAACT 

GCAGAACAACTACATCAGTGGACAACTCTGGAACGAGAAAAGCTCAGCGT 

CTTCTAAACTTGAAGGGCTGAACCTGTCGAACAATCGCTTGTCTGGTGCA 

CTTCCTAGTGCTATTGGAAACTATTCAGGGCTGAAGAATCTTGTGTTAAC 

TGGAAATGGTTTCTCAGGTGATATCCCTTCTGATATTGGCAGACTAAAGA 

GCATCTTAAAGCTGGACCTGAGTAGAAACAACTTCTCTGGCACAATCCCT 

CCTCAGATTGGTAACTGTCTTTCCTTAACTTACTTGGATTTGAGCCAAAA 

TCAACTTTCTGGTCCTATCCCAGTTCAAATTGCTCAAATTCACATCTTAA 

ATTACATCAATATTTCCTGGAATCACTTCAACGAGAGCCTTCCCGCGGAG 

ATTGGCTTGATGAAGAGTTTAACTTCAGCAGATTTTTCCCACAATAACTT 

ATCTGGATCAATACCTGAAACAGGCCAATATTTATATTTCAACTCAACTT 

CCTTCACCGGCAACCCTTATCTCTCTGGATCCGACTCGACTCCTAGCAAC 

ATTACATCCAACTCACCGTCAGAACTTGGAGACGGAAGTGACAGCAGAAC 

TAAGGTTCCTACAATATACAAGTTCATATTTGCATTTGGGCTCTTATTCT 

GCTCCCTCATTTTCGTTGTCTTAGCAATAATCAAGACAAGAAAGGGGAGT 

AAGAATTCAAATTTGTGGAAGCTGACAGCATTTCAGAAGCTTGAGTTCGG 

AAGTGAAGACGTCTTGCAGTGCTTGATAGACAACAACGTCATAGGGAGAG 

GTGGAGCAGGGATAGTGTATAAGGGAACTATGCCAAATGGTGATCATGTC 

GCGGTGAAGAAATTGGGAATAAGCAAAGGCTCACATGATAACQGCCTATC 
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TGCTCAACTTAACACATTACGCAAGATCAGGCATAGGTACATTGTGAGAC 

nS™^^ TOGTTC ^ C ^ GG ^ GTC ^ C1 ^ CTAGT "'ATGAGTAC 

ATGCTAAATGGAAGCTTAGGTGAAGTGCTTCATGGGAAGAACGGCGGGCA 

ACTCCAATGGGAAACTAGGCTAAAAATAGCCATAGAAGCTGCCAAGGGCC 

TTTCTTATTTGCACCACGATTGCTCCCCTATGATAATCCACCGCGATGTC 

A^TCCAACAATATATTGTTGAACTCTGAACTTGAAGCTCATGTTGCAGA 

TTTTGGATTAGCCAAGTACTTTCGTAACAATGGTACCTCTGACTGCATGT 

CTGCAATTGCAGGATCTTATGGCTACATTCCTCCAGAATATGCATACACG 

CTGAAAATTGATGAGAAAAGCGATGTGTATAGCTTTGGAGTGGTGTTGTT 

GGAGCTTATAACAGGACGAAGGCCAGTAGGAAATTTTGGAGAAGAAGGAA 

T GG ^ G ^^ G ^ AGAA ^ GGGGGAAA ^CGGAGACAAAATGGAGCAAAGAAGGG 

^ GG I G ^^ TC " GGATGAGAGGCTA AAAAATGTTGCAATTGTTGAAGC 

I A I^™ TATTTTTTGTAG CAATGCTTTGTGTTGAAGAGTACAGCATTG 

rriaiTnrwp^^*'*'''*'^''^ 0 TCCAAATGCTTTCTCAAG CTAAAC AA 

CCAAATACTTTCCAAATCCAATAATCTAATTGTGGCTCTACTTATTGTAT 

A^^ CACCCC ^ GTTAGCTTTGCAAAAGTGAA ^TCACAAATTA 

ATCTAAGTGAAGTAGTTGCAAAATTAATTTGCAATTATGTTAGATCTTAG 

GTAGAACTAGTATTAGTATCCGCGTGATGTGTGGCGAATATCAAAAGAAA 
SSiS2J^ CTAATXCCKCYG '^ WCWR ^ GKMSWCCMGGM SKCGA 
WYKCCRCCRATACTGACGGACTCCAGGAGTCGTCGCCACCAAT 

SEQ ID NO: 18 
TRK1 

SGRSSKYLLHLLLTFDIFQSICNFKSLVI 

MATSNTSLLFFAYFLLVFLITPSQSRNLSLRRQAKTLVSLKYA 

IHELTRLRVLNISN NLFS GNLPWE 
YREFNVLQVLDAYN NNFS GPLPLG 
VTQLVQLKYLNFGG NYFS GKIPLS 
YGSFNQLEFLSLAG NDLH GPIPRE 
LGNVTSLRWLQLGYYNQFDEGI PPE 
LG KLVNLVHLDLSS CNLT GSIPPE 
LGNLNMLDTLFLQK NQLT GVFPPQ 
LGNLTRLKSLDISV NELT GEIPVD 
LSGLKELILLNLFI NNLH GEIPGC 
IAELPKLEMLNLWR NNFT GSIPSK 
LGMNGKLIEIDLSS NRLT GLIPKS 
LCFGRNLKILILLD NFLF GPLPDD 
FGQCRTXSRVRMGQ NYLS GSIPTG 
FLYLPELSLVELQN NYIS GQLWNEK 
SSASSKI^TGLNLSN NRLS GALPSA 
IGNYSGLKNLVLTG NGFS GDIPSD 
IGRLKSILKLDLSR NNFS GTIPPQ 
I GNCLS LTYLDLS Q NQLS GPIPVQ 
IAQIHILNYINISW NHFN GSLPAE 
IGLMKSLTSADFSH NNLS GSIPET 

^S^^^^^^^^^^^^^^^^^^^^^^^^SNSPSELGDGSDSRTKVPTIYK 
F I F AFG t*LFCS L I FWL A 1 1 

KTRKGSKNSNLWKLTAFQKLE 

KI AIEAAKGLSYLHHDCSPMI IHRDVKSNNILLNSELEAHVADFGLA 

KYFRNNGTSECMSAIAGSYGYIAPEYAYTLKIDEKSDVYSFGWLLELITGR 

RPVGNFGEEGMDIVQWAKTETKWSKEGWKILDERLKNVAIVEAMQVFFVAM 
LCVEEYSIERPTMREWQMLSQAKQPNTFQIQ 
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SNCGSTyCMLCNTPFVSFAKVKSQINLSEWAKLICNYVRS . GMISNYILS 
TWNSVLDV . N . Y . VPRDVWRISKESR? ? ?LIP? ? ? ? ? ? R ?R? PPILTDSRSRRHQ 



SEQ ID NO: 19 

TRK2 



TGACTCTCTCTGTCTCTCTCTGTTCGCAGCCCCAAAAAGTAGCGTTAGGG 

CTAGGGTTTTTGAGTTTCAAAACCCCATTTCTGGTTCCTATAATCTTCAC 

ATACAAGGGGAGTTTGTCTCTGTTGCATTCTTTGAAGACCCTTTTGGGGT 

TTTACTAATGGGTCGTTGTTGTTTTGTCATCAAATGGTACTATCATGACA 
TACCCTTGAAACTTTTTCTCATccTTTfiTfi'i«iw*M»«-~i, « 



*„*~w*_* x a v,v, a «-<**»attcgg AT AAATC AGOG CTCTtGG AGTTAAAGG C 
CTCATTTTCAGATTCCTCTGGAGTGATTTCTAGCTGGAGCTCCAGAAATA 
ATGATCACTGTTCATGGTTTGGTGTCTCCTGTGATTCCGATTCACGTGTT 
GTGGCTTTGAACATCACTGGAGGTAATTTGGGTTCTTTATCTTGTGCTAA 
AATTGCTCAATTTCCTTTGTATCCCTTTGGAATTACAAGGGTTTGTGCTA 
ATAATAGTGTCAAGCTTGTTGGTAAAGTACCTCTCGCAATATCAAAATTA 
ACTGAACTAACGGTTTTATCCTTGCCTTTTAATGAATTGCGTGGTGATAT 
TCCATTGGGAATTTGGGATATGGACAAACTTGAAGTT7TGGATCTGCAAG 
GGAATTTAATTACTGGGTCTTTGCCATTGGAGTTTAAGGGGTTGAGGAAA 
TTGAGGGTTTTAAACTTGGCTTTTAATCAGATTGTGGGTGCCATACCGAA 
TTCCTTGTCAAATTGCCTTGCTCTACAAATCTTTAATCTTGCTGGAAATA 
GGGTAAATGGGACCATTCCAGCATTCATTGGTG GATTTGAAGATCTGAGG 
GGAATCTACCTGTCTTTTAATGAGCTTAGCGGGTCTATTCCTGGTGAAAT 
TGGGCGTTCTTGTGAGAAGCTTCAAAGTCTAGAGATGGCAGGTAATATCT 
TAGGTGGTGTTATTCCAAAAAGTTTAGGGAACTGCACACGGTTGCAGTCA 
CTTGTCTTATATTCAAATT7CTTGGAAGAGGCTATTCCAGCTGAATTTGG 
TCAACTAACTGAGCTCGAGATTCTTCATTTGTCTAGGAACAGCCTAACTG 
GTCGACTACCATCTGAGCTGGGAAACTGCTCGAAACTATCCATTCTTGTA 
SI^ AAGTTTGTCGGATCC CCTTCCAAATGTGTCTGATTCAGCTCATAC 
TACTGATGAGTTTAACTTTTTTGAAGGCACAATCCCATCAGAGATCACCA 
GGCTTCCTAGTTTGAGAATGATATGGGCTCCCAGGTCAACTCTTTCAGGA 
AAATTTCCTGGCAGTTGGGGTGCTTGTGACAATTTGGAGATCGTGAACTT 
GGCTCAAAATTATTATACTGGAGTGATTCCTGAGGAATTGGGTAGCTGCC 

<xs ; g ^ g J tgca: ^ ctcgac «gagctcaaataggctgactggacagctt 

GTTGAGAAACTGCCAGTCCCTTGCATGTTTGTGTTCGATGTGAGTGGGAA 



\CTTTCTT 



— ~* "~ wn^iwu XTGGCTGGCAAT AATCTGGTTGC 

CCCCTCAAGTTTTGGCCAATTGCACTCTTTAGAAACGCTTGAACn 

^^" G ^ TGTCTGGTGAAATTCC ^TAATCTGGTAAATTTGAG Gf! VT 

"^ G I TCCC ^ CTOCTGAACAACAA CAATTTATCAGGGAAAATACCTTC- 

AGGCTTGGCCAATGTGACCACACTGGCAGCATTTAACGTTTCTTTCAATA 

™ GTCTGGGCCACTGCCTCTTAA CAAACATTTGATGAAGTGTAATAGT 

"^ GGG ^ CCCCTTTCTGCAATCGT GCCATGTATTTTCTCTATCAAC 

ACCTTCTACAGATCAGCAGGGAAGAATAGGGGACTCACAAGATTCTGCTG 

CGTCTCCTTCAGGTTCAACCCAGAAAGGAGGGAGCAGCGGTTTCAACTCC 

ATAGAGATTGCATCCATAACATCTGCGGCAGCTATTGTGTCAGTTCTTCT 
TGCTCTGATAGTCCTGTTrTTTT irtfr«^» . 
GTTCCTTTAACATTTGAAAATGTAGTGCCCGCCACAGGGAGCTTCAATGC 

TTGCACCACGGTT^^^**^^^^^^'^^^^^^^^^^^^^^^^ 
TTGCACCAGGGTTCCTAGTGGCAGTAAAGCGACTTGCTGTAGGACGTTTT 

TGTTTCTGATCTATAACTATTTGCCAGGTGGTAATTTGGAAAAGTTTATT 

^gagaggtctacaagggctgtggactggagggttcttcacaagISgJ 
ttoggatgtagcccgtgc^cttgcttacctgcatgaSSg^Sa? 

TATaa<Pf^^^^^ A ^^ AA ^^^ A ^^ A ^^^^^^^^^^^^GAG 
AGAGACCCATGCAACTACTGGTGTCGCGGGAACTTTTGGATATGTTGCT? 

ctg^tatgccatgacttgccgcxstctcggacaaggctgatctc^caS 

J«)^^^^^^^^^^^^^ G ^^^^^^^^^AAGAAAGCAC^GATCC 
GTCTTTCTCTTCTTATGGAAATGGATTCAATATTGTAGCTTGGGCATGCA 

ACGGTTGAAGCAACTTCAACCCCCGTCGTGTTAGCTGCGGCATGTC^ 

G ? A I AGGATATGGTT ^^ 

^^^^"^^^^ATTAGGTTCAGATTTGTATTTG?^ 
CTGCTTGTGAATTGTAGTATATAGCCAGCCCCC : ATTTTTCC • ATCTCAT 

gtcccctaattagggggtgtgcagattcttct^cagaagag-JSgSI 
CT tgtcttcaacatgtacc : acatttttttt TC ? C tc^^ GA J A 

aaaaataggaaccaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaaa 

SEQ ID NO: 20 
TRK2 



DSLCLSLFAAPKSRVRARVFEFQNPISGSYNLHIQGEFVSVAF^-DPFGVLLMGR 

RWALNITGGNLGSLSCAKIAQFPLYGFG 
ITRVCANNSV 

KLVGKVPLA 
ISKLTELRVLSLPF NELRGDIPLG 

IWDMDKLEVLDLQG NLITGSLPLE 

FKGLRKLRVLNLGF NQIVGAIPNS 

LSNCLALQIFNLAG NRVNGTIPAF 

I GGFEDLRG I YLS F NELSGSIPGE 
IGRSCEKLQSLEMAGNILGGVIPKS 

LGNCTRLQSLVLYS NLLEEAIPAE 

FGQLTELEILDLSR N5LSGRLP5E 

LGNCSKLSILVLSSLWDP 

LPNVSDSAHTTDEF NFFEGTIPSE 
ITRLPSFENDMAPR STLSGKFPGS 

WGACDNLEIVNLAQ NYYTGVIPEE 

LGSCQKLHFLDLSS NRLTGQLVEK 

LPVPCMFVFDVSG NYLSGS I PRF 

SNYSCAHWSSGGEPFGPYDTSSAYLAHFTSRSVLDTTLF 
AGDGNHAVFHNFGV NNFTGNLPPS 

MLIAPEMLGKQIVYAFLAGSNRFTGPFAGNL 
FEKCHELNGMIVNVSNNALSGQIPED 
IGAICGSLRLLDGSKNQIVGTVPPS 
LGSLVSLVALNLSW NHLRGQIPSR 
LGQIKDLSYLSLAG NNLVGPIPSS 
FGQLHSLETLELSS NSLSGEIPNN 
LVNLRNLTSLLLNN NNLSGKIPSG 
LANVTTLAAFNVSF NNLSGPLPLN 
^^™^^ NPFL C SC HVFSLSTPSTDQQGRICDSQDSAASPSGSTQKGCSSCFNSIEIASITSAAAIVSVI.LA 
PJWNPRSRVAGSTRKEVTVFTEVPVPLTFE 

NVyRATGSFNASNCIGSGGFGATYKAEIAPGFLVAVKRLAVGRFQGIQQFDAEIRTLG 

RLRHPNLVTLIGYHNSETEMFLIYNYLPGGNLEKFIQERSTRAVDWRVLHKIALDVAR 

ALAYLHDQCVPRVLHRDVXPSNILLDEEYNAYLSDFGLARLLGT 

S BTHATTGVAGTFG YVAPE YAMTCR VS DKAD VY S YGVVLLEL I S 

OKKALOPSFSSYGNGPN1VAWACMLLRRAVLRSSLRLVYGIQVH 

MMIWMRSYTWQWSARLTLFLLDQQ 

-SK. .DG.SNFNPRRVSCGMCFG.DMV. PNCN?KTCP. . .GVFGCLVLGSDLYL. PACE 

L. YIASPJFF7CHV? . LGGVQIL7AEECRYLSSTCT? IFFCLLNKSKK. EPKKKKKKKKKK7 7 

SEQ ID NO: 21 

TRK3 

TCAAGCAACAATTTGTCAGGCCAGATTCCCAAGTCCTTGAGGAATCTTGAACATCTCATGTATTTCAATGTCTCGT 

TCa^ATGGCCTCATGGGTGAAATTCCAGATGCAGGGCCATTCGTAAATTTTACAGCTGAATCATTCATGGGTAACCC 

TGCATTATGTGGATCAT<»CGCTTCCGTGTGATGCAATGCAGAGTCACTAGTCTTGAAAGAAAAGGAAAGAGTAGA 

GTCTTAACTTCTGTTCTTGCATCAGCTTCCTCAGGAGTTGTAGTCACGACCATTTTCATCATTTGGTTTCTGAAAT 

GCCGAAAAAGGAGTACGGAACTTCCTCTAGTTGATACATTTGGTCAGGTACATAAGAGGATTTCGTACTATGATAT 

TCCTCAAGGGACAAACAGCTTTGATGAAGCAAACTTGATTGGAAGGGGGAGCCTTGGTTTGGTCTACAAAGGAAAG 
CTTG AAAATCCTAAGCG 

SEQ ID NO: 22 

TRK3 

SSNNLSGQIPKSLRNLEHLMYFNVSFNGLMGEIPDGGPFVNFTAESFMGNPALCGSSRFRVMQCRVTSLERKGKSR 

VLTSVLASASSGWVTTIFIIWFLKCRKRSTEt.PLVDTFGQVHKRISYYDIPQGTNSFDEANLIGRGSLLGt.VYKG 
KXJ5NPKR 



SEQ ID NO: 23 
TRK4 

AGCTTGGATCAATTTAACAATGATAAACTGCAATTTAGCTGGTCCTTTGCCTGAATTTCTTGGAACTATGTCTTCT 
TTAGAGGTTTTGTTGTTGTCTACAAATAGGCTTTCAGGGCCTATTCCAGGTACTTTCAAGGATGCAGTGCTGAAGA 

^^ATCTTTGGCTTCATGGGAATCAATTTTCAGGTAAAATCCCAGTAGAGATTGGTAATCTAACAAATCTGAAGGAT 
CTCAGTGTGAATACAAATAACCTTGTTGGATTAATCCCTGAAAGTTTAGCTAATATGCCATTAGACAATCTTGATT 
TGAATAATAATCATTTTATGGGACCAGTTCCTAAGTTCAAGGCTACTAATGTTAGTTTTATGTCCAACTCTTTTTG 
TCAAACCAAACAAGGAGCAGTATGTGCCCCTGAGGTTATGGCACTTTTAGAGTTTCTTGATGGGGTGAATTATCCT 
TCTAGGCTTGTTCAATCATGGTCTGGAAACAACCCTTGTGACGGACGTTGGTCGGCAATAACCTGTGACGATAACC 
AAAAAGTTAGTGTTATAAACTTGCCCAAGTCTAATCTTTCCGGGACCTTGAGTCCTTCAATCGCGAACCTTGAAAC 
CGTTACTC^CATTTATCTTGAATCAAATAATCTTTCTGGTTTTGTTCCATCTAGTTGGACTAGTTTGAAATCTCTG 
TCTATTCTTGATTTGAGTAATAACAATATTTCCCCACCTTTGCCTAAATTTACCACCCCTTTGAAACTTGTTCTAA 
ATGGAAATCCAAAGCTGACTTCTAATCCTCCTGGAGCAAATCCTTCACCAAACAACAGCACAACTCCTGCAGATTC 
ACCCACGTCGTCTGTACCATCTTCACGACCCAACAGTTCAAGCTCTGTGATCTTTAAACCCAGTGAACAGTCACCC 
GAGAAAAAGGACTCAAAGTCAAAGATAGCTATAGTTGTGGTTCCTATTGCTGGTTTTCTACTTTTGGTTTGTCTTG 
CTATTCCACTGTACATTTATGTCTGTAAGAAGAGTAAAGATAAGCATCAAGCTCCAACTGCTCTTGTGGTTCATCC 
TAGAGATCCGTCTGATTCGGATAATGTAGTCAAGATTGCGATTGCCAATCAGACTAATGGAAGTCTTTCCACAGTA 
AATGCAAGTGGTTCTGCTAGCATACACAGTGGTGAATCCCATTTGATCGAAGCTGGGAATTTGCTCATATCGGTTC 
AAGTACTTCGGAATGTGACTAAGAATTTTTCTCCGGAAAATGAACTTGGACGTGGTGGTTTTGGTGTGGTTTATAA 
GGGAGAATTAGATGATGGGACACGAATCGCTGTCAAAAGAATGGAGGCTGGTATTGTTAGCAACAAAGCT 

SEQ ID NO: 24 
TRK4 aa 

AWINLTMINCNLAGPLPEFLGTMSSLEVLLLSTNRLSGPIPGTFKDAVLKMLWLNDQSGDGMSGSIDWATMVSLT 
HLWLHGNQFSGKIPVEIGNLTNLKDLSVNTNNLVGLIPESLANMPLDNLDLNNNHFMGPVPKFKATNVSFMSNSFC 
QTKQGAVCAPEVMALLEFLDGVNYPSRLVESWSGNNPCDGRWWGISCDDNQKVSVINLPKSNLSGTLSPSIANLET 
VTHIYLESNNLSGFVPSSWTSLKSLSILDLSNNNISPPLPKFTTPLKLVLNGNPKLTSNPPGANPSPNNSTTPADS 
PTSSVPSSRPNSSSSVIFKPSEQSPEWCDSKSKIAIVVrvPIAGFLLLVCLAIPLYIYVCKKSKDKHQAPTALVVHP 
SEQ ID NO: 25 
TRK5 3' 




ACTCCTTCTATCATTACATCTTTTTT^ 
^TTTTAAAATTATTCGATCA^C^^^^ 

TTATAAAAGATAAGTTTCCATTTC^ 

ctactggacgagtcacaaccaaISS^^ 

AGT ^ C ^CAAAGTAGACGTTTACCTTTGGCGTTGTTTTGATGGAGATCATTACTGGTAGAAAAG 



SEQ ID NO: 26 

TRK5 5' 



TCTCTATATGCTAGTA^ 



SEQ ID NO: 27 
TRK5 5' aa 



SEQ ID NO: 28 
TRK6 3' 



^^^^^^^^^^^^ 
SEQ ID NO: 29 
TRK7 2' 

r C ^f3Tf^f T ^ G:mG ^ C GAATTCATGTTACTTTGAGTATC^ 



GGTA^^TM GTG ^ AAC ^ 